# Tutorial

> Learn how to upload a Snakemake workflow on Latch.

## Prerequisites

* Register for an account and log into the [Latch Console](https://console.latch.bio)
* Install a compatible version of Python. The Latch SDK currently supports only Python `>=3.8` and `<=3.11`
* Install the Latch SDK `== 2.62.1a2`

Example on Ubuntu:

```bash theme={null}
mamba create -n latch-snakemake python=3.11
mamba activate latch-snakemake
pip install latch==2.62.1a2
```
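
Because the SDK supports only a specific range of Python versions, it can be worth confirming which interpreter the new environment actually uses. The following is a minimal sketch; `is_supported_python` is a hypothetical helper written for this tutorial and is not part of the Latch SDK.

```python
import sys

# Supported range taken from the prerequisites above: >=3.8 and <=3.11.
# Only (major, minor) is compared, so any 3.11.x patch release counts as supported.
MIN_VERSION = (3, 8)
MAX_VERSION = (3, 11)


def is_supported_python(version=None):
    """Return True if the (major, minor) version is within the supported range."""
    if version is None:
        version = sys.version_info[:2]
    major_minor = tuple(version[:2])
    return MIN_VERSION <= major_minor <= MAX_VERSION


if __name__ == "__main__":
    current = ".".join(str(part) for part in sys.version_info[:3])
    status = "supported" if is_supported_python() else "not supported"
    print(f"Python {current} is {status} by the Latch SDK")
```

Running this inside the activated `latch-snakemake` environment should report the interpreter as supported if the environment was created with Python 3.11.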

## Step 1: Clone your Snakemake workflow

We will use the [snakemake-v2-tutorial](https://github.com/latchbio/snakemake-v2-tutorial) as an example; however, feel free to follow along with any Snakemake workflow.

```bash theme={null}
git clone https://github.com/latchbio/snakemake-v2-tutorial.git
cd snakemake-v2-tutorial
```

## Step 2: Configure workflow resources and containers

Before deploying to Latch, we need to specify resource requirements for each job in the workflow. Since this is a relatively low-footprint pipeline, each job can run on a small machine with 1 core and 2 GiB of RAM.

<Tip>
  You only need to do this if your Snakefile rules don't already have resources defined. This profile serves as a fallback for rules without explicit resource specifications.
</Tip>

Create a directory called `profiles/default` and in it create a file called `config.yaml`:

```bash theme={null}
mkdir -p profiles/default
touch profiles/default/config.yaml
```

Then, add the following YAML content to the `config.yaml`:

```yaml theme={null}
default-resources:
  cpu: 1
  mem_mib: 2048
```

This will set the default resources for every rule. Note that you can override these for any rule by updating the resources of that rule directly.
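
The fallback behavior can be pictured as a dictionary merge in which per-rule values take precedence over the profile defaults. The sketch below only models that behavior; it is not how Snakemake implements it internally, and `effective_resources` and the example rule values are hypothetical.

```python
# Fallback values from profiles/default/config.yaml.
DEFAULT_RESOURCES = {"cpu": 1, "mem_mib": 2048}


def effective_resources(rule_resources, defaults=DEFAULT_RESOURCES):
    """Merge per-rule resources over the defaults; rule values win."""
    return {**defaults, **rule_resources}


# A hypothetical rule that sets nothing falls back to the defaults.
print(effective_resources({}))
# A hypothetical rule that requests more memory keeps the default CPU count.
print(effective_resources({"mem_mib": 8192}))
```

In the Snakefile itself, the equivalent override is the rule's own `resources` section, as noted above.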

## Step 3: Define metadata and workflow graphical interface

The input parameters need to be explicitly defined to construct a graphical interface for a Snakemake workflow. These parameters will be exposed to scientists in a web interface once the workflow is uploaded to Latch.

First, make a directory called `latch_metadata` and in it create a file called `__init__.py`:

```bash theme={null}
mkdir latch_metadata
touch latch_metadata/__init__.py
```

In `latch_metadata/__init__.py`, create a `SnakemakeV2Metadata` object as below:

```python theme={null}
from latch.types.directory import LatchDir, LatchOutputDir
from latch.types.metadata.latch import LatchAuthor
from latch.types.metadata.snakemake import SnakemakeParameter
from latch.types.metadata.snakemake_v2 import SnakemakeV2Metadata

metadata = SnakemakeV2Metadata(
    display_name="Snakemake Tutorial Workflow",
    author=LatchAuthor(),
    parameters={},
)
```

This object doesn't have any parameter metadata yet, so we need to add it. The workflow configuration in the `config.yaml` file shows that the pipeline expects three config parameters: `samples_dir`, `genome_dir`, and `results_dir`. The first two are inputs to the pipeline, and the last is the location where outputs will be stored.

We want all three of these to be exposed in the UI, so we will add them to the `parameters` dict in `latch_metadata/__init__.py`:

```python theme={null}
from latch.types.directory import LatchDir, LatchOutputDir
from latch.types.metadata.latch import LatchAuthor
from latch.types.metadata.snakemake import SnakemakeParameter
from latch.types.metadata.snakemake_v2 import SnakemakeV2Metadata

metadata = SnakemakeV2Metadata(
    display_name="Snakemake Tutorial Workflow",
    author=LatchAuthor(),
    parameters={
        "samples_dir": SnakemakeParameter(
            display_name="Sample Directory",
            type=LatchDir,
        ),
        "genome_dir": SnakemakeParameter(
            display_name="Genome Directory",
            type=LatchDir,
        ),
        "results_dir": SnakemakeParameter(
            display_name="Output Directory",
            type=LatchOutputDir,
        ),
    },
)
```

In each parameter, we specified (1) a human-readable name to display in the UI, and (2) the type of parameter to accept. Since the workflow expects all of these to be directories, they are all `LatchDir`s (we made `results_dir` a `LatchOutputDir` because it is an output directory).
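
Since each parameter name is used as a config key (the generated entrypoint in Step 4 builds the Snakemake config from these names), it can help to check that the two sets of keys line up. The following is a minimal sketch using plain Python collections; `compare_keys` is a hypothetical helper, not part of the Latch SDK.

```python
def compare_keys(config_keys, parameter_keys):
    """Return (missing, extra).

    missing: config keys that have no matching parameter.
    extra: parameters that have no matching config key.
    """
    config_keys, parameter_keys = set(config_keys), set(parameter_keys)
    missing = sorted(config_keys - parameter_keys)
    extra = sorted(parameter_keys - config_keys)
    return missing, extra


# Keys the tutorial workflow's config expects.
config_keys = ["samples_dir", "genome_dir", "results_dir"]
# Keys of the `parameters` dict defined above.
parameter_keys = ["samples_dir", "genome_dir", "results_dir"]

missing, extra = compare_keys(config_keys, parameter_keys)
print("missing:", missing, "extra:", extra)
```

Both lists are empty when every config key has a matching parameter, which is the case for the metadata shown above.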

For now, this is all we need and we can move on. If you like, you can customize the metadata object further using the interface described [here](/workflows/sdk/ui/latch-metadata).

Let's inspect the most relevant fields of the `SnakemakeV2Metadata` object:

* **`display_name`**: The display name of the workflow, as it will appear in the Latch UI.

* **`author`**: Name of the person or organization that publishes the workflow.

* **`parameters`**: Input parameters to the workflow, defined as `SnakemakeParameter` objects. The Latch Console will expose these parameters to scientists before they execute the workflow.

## Step 4: Generate the entrypoint

Now we need to generate the entrypoint file containing the Latch workflow wrapping our Snakemake workflow. This is a simple command:

```bash theme={null}
latch snakemake generate-entrypoint .
```

This should create a directory called `wf` containing a file called `entrypoint.py`. The file should have the following contents:

<Accordion title="wf/entrypoint.py">
  ```python theme={null}
  import json
  import os
  import shutil
  import subprocess
  import sys
  import typing
  from dataclasses import dataclass
  from enum import Enum
  from pathlib import Path

  import requests
  import typing_extensions
  from latch.resources.tasks import custom_task, snakemake_runtime_task
  from latch.resources.workflow import workflow
  from latch.types.directory import LatchDir, LatchOutputDir
  from latch.types.file import LatchFile
  from latch_cli.services.register.utils import import_module_by_path
  from latch_cli.snakemake.v2.utils import get_config_val

  import_module_by_path(Path("latch_metadata/__init__.py"))

  import latch.types.metadata.snakemake_v2 as smv2


  @custom_task(cpu=0.25, memory=0.5, storage_gib=1)
  def initialize() -> str:
    token = os.environ.get("FLYTE_INTERNAL_EXECUTION_ID")
    if token is None:
        raise RuntimeError("failed to get execution token")

    headers = {"Authorization": f"Latch-Execution-Token {token}"}

    print("Provisioning shared storage volume... ", end="")
    resp = requests.post(
        "http://nf-dispatcher-service.flyte.svc.cluster.local/provision-storage-ofs",
        headers=headers,
        json={
            "storage_expiration_hours": 0,
            "version": 2,
            "snakemake": True,
        },
    )
    resp.raise_for_status()
    print("Done.")

    return resp.json()["name"]


  @snakemake_runtime_task(cpu=1, memory=2, storage_gib=50)
  def snakemake_runtime(
    pvc_name: str,
    samples_dir: LatchDir,
    genome_dir: LatchDir,
    results_dir: LatchOutputDir,
  ):
    print(f"Using shared filesystem: {pvc_name}")

    shared = Path("/snakemake-workdir")
    snakefile = shared / "Snakefile"

    config = {
        "samples_dir": get_config_val(samples_dir),
        "genome_dir": get_config_val(genome_dir),
        "results_dir": get_config_val(results_dir),
    }

    config_path = (shared / "__latch.config.json").resolve()
    config_path.write_text(json.dumps(config, indent=2))

    ignore_list = [
        "latch",
        ".latch",
        ".git",
        "nextflow",
        ".nextflow",
        ".snakemake",
        "results",
        "miniconda",
        "anaconda3",
        "mambaforge",
    ]

    shutil.copytree(
        Path("/root"),
        shared,
        ignore=lambda src, names: ignore_list,
        ignore_dangling_symlinks=True,
        dirs_exist_ok=True,
    )

    cmd = [
        "snakemake",
        "--snakefile",
        str(snakefile),
        "--configfile",
        str(config_path),
        "--executor",
        "latch",
        "--default-storage-provider",
        "latch",
        "--jobs",
        "1000",
    ]

    print("Launching Snakemake Runtime")
    print(" ".join(cmd), flush=True)

    failed = False
    try:
        subprocess.run(cmd, cwd=shared, check=True)
    except subprocess.CalledProcessError:
        failed = True
    finally:
        if not failed:
            return

        sys.exit(1)


  @workflow(smv2._snakemake_v2_metadata)
  def snakemake_v2_snakemake_tutorial_workflow(
    samples_dir: LatchDir, genome_dir: LatchDir, results_dir: LatchOutputDir
  ):
    """
    Sample Description
    """

    snakemake_runtime(
        pvc_name=initialize(),
        samples_dir=samples_dir,
        genome_dir=genome_dir,
        results_dir=results_dir,
    )
  ```
</Accordion>

## Step 5: Generate the Dockerfile

The last step before registering is to generate the `Dockerfile` that defines the environment the runtime executes in. In particular, we want that environment to contain the conda environment defined by `environment.yaml`.

Again, we can accomplish this with a simple command:

```bash theme={null}
latch dockerfile --snakemake -c environment.yaml . -f
```

This will generate a file called `Dockerfile` with the following contents:

<Accordion title="Dockerfile">
  ```Dockerfile theme={null}
  # Prologue
  # DO NOT CHANGE
  from 812206152185.dkr.ecr.us-west-2.amazonaws.com/latch-base:fe0b-main

  workdir /tmp/docker-build/work/

  shell [ \
      "/usr/bin/env", "bash", \
      "-o", "errexit", \
      "-o", "pipefail", \
      "-o", "nounset", \
      "-o", "verbose", \
      "-o", "errtrace", \
      "-O", "inherit_errexit", \
      "-O", "shift_verbose", \
      "-c" \
  ]
  env TZ='Etc/UTC'
  env LANG='en_US.UTF-8'

  arg DEBIAN_FRONTEND=noninteractive

  # Install Mambaforge
  run apt-get update --yes && \
      apt-get install --yes curl git && \
      curl \
          --location \
          --fail \
          --remote-name \
          https://github.com/conda-forge/miniforge/releases/latest/download/Miniforge3-Linux-x86_64.sh && \
      `# Docs for -b and -p flags: https://docs.anaconda.com/anaconda/install/silent-mode/#linux-macos` \
      bash Miniforge3-Linux-x86_64.sh -b -p /opt/conda -u && \
      rm Miniforge3-Linux-x86_64.sh

  # Set conda PATH
  env PATH=/opt/conda/bin:$PATH
  run conda config --set auto_activate_base false

  # Build conda environment
  copy environment.yaml /opt/latch/environment.yaml
  run mamba env create \
      --file /opt/latch/environment.yaml \
      --name environment
  env PATH=/opt/conda/envs/environment/bin:$PATH

  # Copy workflow data (use .dockerignore to skip files)
  copy . /root/

  # Epilogue

  # Latch SDK
  # DO NOT REMOVE
  run pip install "latch[snakemake]"==2.55.0.a6

  # Latch workflow registration metadata
  # DO NOT CHANGE
  arg tag
  # DO NOT CHANGE
  env FLYTE_INTERNAL_IMAGE $tag

  workdir /root
  ```
</Accordion>

## Step 6: Register the workflow

To register a Snakemake workflow on Latch, run:

```bash theme={null}
latch login
latch register -y .
```

The `latch register` command searches for a Latch workflow in the current directory and registers it to Latch.

After running the above command, the Latch SDK will generate the necessary files and upload your workflow to Latch.

Once the workflow is registered, navigate to the [Workflows tab](https://console.latch.bio/workflows) in the Latch Console and select the workflow you previously registered.

## Step 7: Execute the workflow

Before executing the workflow, we need to upload test data to Latch. You can upload the data using `latch cp`:

```bash theme={null}
latch cp data latch:///snakemake-tutorial-data
```

This will upload the data to a folder called `snakemake-tutorial-data` in your account on Latch.

Now, navigate to the [Workflows tab](https://console.latch.bio/workflows) in the Latch Console and select the workflow you previously registered.
Then, select the appropriate input parameters from the test data you uploaded and click `Launch Workflow` to execute the workflow.

## Step 8: Monitor the workflow

After launching the workflow, you can monitor progress by clicking on the appropriate execution under the `Executions` tab of your workflow.

Under the `Graph & Logs` tab, you can view the generated DAG and, once the workflow starts executing, follow the status of each rule in your Snakemake workflow.

## Step 9 (Optional): Customize the workflow using dynamic sample detection

You may have noticed that the sample names are hardcoded in the Snakefile. This is not ideal: we should be able to infer the sample names from the contents of the sample directory.

To accomplish this, we need to edit both the Snakefile and the `wf/entrypoint.py` file. Since we need to know the contents of the sample directory outside of a rule, we will need to stage it locally before the pipeline executes.

First, add the following import to the top of the `wf/entrypoint.py` file:

```python theme={null}
from latch.ldata.path import LPath
```

Next, edit the start of `snakemake_runtime(...)` so that it is the following:

```python theme={null}
@snakemake_runtime_task(cpu=1, memory=2, storage_gib=50)
def snakemake_runtime(
    pvc_name: str,
    samples_dir: LatchDir,
    genome_dir: LatchDir,
    results_dir: LatchOutputDir,
):
    print(f"Using shared filesystem: {pvc_name}")

    shared = Path("/snakemake-workdir")
    snakefile = shared / "Snakefile"

    # Staging samples_dir
    local_samples_dir = LPath(samples_dir.remote_path).download(shared / "samples")

    config = {
        "samples_dir": get_config_val(local_samples_dir),
        "genome_dir": get_config_val(genome_dir),
        "results_dir": get_config_val(results_dir),
    }

    ...
```

Here we explicitly download the `samples_dir` before calling `snakemake`, so the contents of the directory are known without needing to be inside a rule.

Lastly, we will need to edit the Snakefile and remove the hardcoded samples:

```python theme={null}
# Replace SAMPLES = ["A", "B"] with the following.
# This assumes `samples_dir` is already defined earlier in the Snakefile as a pathlib.Path.

SAMPLES = []
for sample in samples_dir.iterdir():
    SAMPLES.append(sample.stem)
```
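
To see what this loop produces, the same logic can be tried outside Snakemake on a throwaway directory. This is a sketch under assumptions: the file names below are hypothetical stand-ins for the contents of the sample directory, and `detect_samples` is an illustrative helper that additionally skips subdirectories and sorts the result.

```python
import tempfile
from pathlib import Path


def detect_samples(samples_dir):
    """Return sorted sample names derived from the file names in samples_dir."""
    return sorted(p.stem for p in Path(samples_dir).iterdir() if p.is_file())


# Build a temporary directory with hypothetical sample files to try it out.
with tempfile.TemporaryDirectory() as tmp:
    for name in ("A.fastq", "B.fastq", "C.fastq"):
        (Path(tmp) / name).touch()
    samples = detect_samples(tmp)
    print(samples)
```

Note that `Path.stem` strips only the final suffix, so a file named `A.fastq.gz` would yield `A.fastq` rather than `A`; compressed inputs would need extra handling.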

Now re-register the workflow. All three samples should run through the pipeline.

```bash theme={null}
latch register .
```

***

## What You've Learned

**Core Concepts:**

* **Latch's Snakemake integration** allows running Snakemake workflows with a graphical web interface.
* **Metadata definition** creates the parameter interface that scientists will use to configure and run workflows.

**Development Workflow:**

1. Clone your Snakemake workflow.
2. Define metadata to create the workflow's parameter interface.
3. Generate entrypoint with `latch snakemake generate-entrypoint .` to create the Latch wrapper.
4. Generate Dockerfile with `latch dockerfile --snakemake -c environment.yaml . -f` for the execution environment.
5. Register the pipeline with `latch register -y .`.
6. Upload test data to Latch and select inputs from the Console.
7. Monitor execution through the Graph & Logs interface.
8. Customize the generated `entrypoint.py` for additional pre- or post-processing logic.

## Next Steps

* Explore [custom workflow interfaces](/workflows/sdk/python/customizing-your-interface/overview)
* Learn about [testing and debugging workflows](/workflows/sdk/testing-and-debugging-a-workflow/overview)
