Prerequisites
- Register for an account and log into the Latch Console
- Install a compatible version of Python. The Latch SDK currently supports only Python >=3.8 and <=3.11.
- Install the Latch SDK, version 2.62.1a2.
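Before installing the SDK, it can help to confirm that the active interpreter falls in the supported range. The following is a minimal sketch, not part of the Latch SDK; the helper name is_supported is hypothetical, and it assumes "<=3.11" means any 3.11.x patch release.

```python
import sys

# Supported range taken from the prerequisites: Python >=3.8 and <=3.11.
# ASSUMPTION: "<=3.11" covers every 3.11.x patch release.
MIN_SUPPORTED = (3, 8)
MAX_SUPPORTED = (3, 11)


def is_supported(version=None):
    """Return True if the (major, minor) pair is inside the supported range."""
    if version is None:
        version = sys.version_info
    major_minor = tuple(version[:2])
    return MIN_SUPPORTED <= major_minor <= MAX_SUPPORTED


if __name__ == "__main__":
    found = ".".join(str(part) for part in sys.version_info[:3])
    status = "supported" if is_supported() else "outside the supported range"
    print(f"Python {found} is {status} for the Latch SDK")
```

Comparing only the major and minor components keeps patch releases such as 3.11.9 inside the range.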
Step 1: Clone your Snakemake workflow
We will use the snakemake-v2-tutorial as an example; however, feel free to follow along with any Snakemake workflow.
Step 2: Configure workflow resources and containers
Before deploying to Latch, we need to specify resource requirements for each job in the workflow. Since this is a relatively low-footprint pipeline, we can make each machine small and provide 1 core and 2 GiB of RAM.
You only need to do this if your Snakefile rules don’t already have resources defined. This profile serves as a fallback for rules without explicit resource specifications.
First, make a directory called profiles/default and in it create a file called config.yaml.
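As a rough illustration of the profile layout, the sketch below creates profiles/default/config.yaml with a default-resources mapping for 1 core and 2 GiB of RAM. Several things here are assumptions rather than confirmed details: the resource key names cpu and mem_mib are illustrative placeholders, the mapping form of default-resources depends on your Snakemake version, and the helper names are hypothetical. Check the Latch and Snakemake documentation for the keys that are actually read.

```python
import tempfile
from pathlib import Path


def gib_to_mib(gib):
    """Convert GiB to MiB (1 GiB = 1024 MiB)."""
    return int(gib * 1024)


def render_profile(default_resources):
    """Render profile text with a default-resources mapping, one key per line."""
    lines = ["default-resources:"]
    for key, value in default_resources.items():
        lines.append(f"  {key}: {value}")
    return "\n".join(lines) + "\n"


def write_profile(workflow_dir, default_resources):
    """Create profiles/default/config.yaml under workflow_dir and return its path."""
    profile_dir = Path(workflow_dir) / "profiles" / "default"
    profile_dir.mkdir(parents=True, exist_ok=True)
    config_path = profile_dir / "config.yaml"
    config_path.write_text(render_profile(default_resources))
    return config_path


# ASSUMPTION: "cpu" and "mem_mib" are placeholder key names, not confirmed
# against the Latch SDK. The values come from the text: 1 core, 2 GiB of RAM.
EXAMPLE_RESOURCES = {"cpu": 1, "mem_mib": gib_to_mib(2)}

if __name__ == "__main__":
    # Write into a throwaway directory so the demo leaves nothing behind.
    with tempfile.TemporaryDirectory() as tmp:
        path = write_profile(tmp, EXAMPLE_RESOURCES)
        print(path.read_text())
```

In a real project you would call write_profile(".", ...) from the workflow root, or simply write the file by hand.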
Step 3: Define metadata and workflow graphical interface
The input parameters need to be explicitly defined to construct a graphical interface for a Snakemake workflow. These parameters will be exposed to scientists in a web interface once the workflow is uploaded to Latch. First, make a directory called latch_metadata and in it create a file called __init__.py.
In latch_metadata/__init__.py, create a SnakemakeV2Metadata object.
Looking at the workflow’s config.yaml file, we see that the pipeline expects 3 config parameters: samples_dir, genome_dir, and results_dir. The first two are inputs to the pipeline and the last is the location where outputs will be stored.
We want all three of these to be exposed in the UI, so we will add them to the parameters dict in latch_metadata/__init__.py.
All three parameters are typed as LatchDirs (we made results_dir a LatchOutputDir because it is an output directory).
For now, this is all we need and we can move on, but if you like, feel free to customize the metadata object further using the interface described here.
Let’s inspect the most relevant fields of the SnakemakeV2Metadata object:
- display_name: The display name of the workflow, as it will appear on the Latch UI.
- author: Name of the person or organization that publishes the workflow.
- parameters: Input parameters to the workflow, defined as SnakemakeParameter objects. The Latch Console will expose these parameters to scientists before they execute the workflow.
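To make the shape of this object concrete, here is a sketch that models the three fields with local stand-in classes so it runs without the Latch SDK installed. The stand-ins only borrow the SDK's class names; the real classes are imported from the SDK, and their constructors may take different or additional arguments. The type field on SnakemakeParameter, the plain-string author, and the display name and author values are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, Type


# STAND-INS ONLY: these are not the Latch SDK classes. They mimic the names
# used in this tutorial so the structure can be shown without the SDK.
class LatchDir:
    """Placeholder for the SDK's directory type."""


class LatchOutputDir(LatchDir):
    """Placeholder for the SDK's output directory type."""


@dataclass
class SnakemakeParameter:
    type: Type[LatchDir]  # ASSUMPTION: field name chosen for illustration


@dataclass
class SnakemakeV2Metadata:
    display_name: str
    author: str  # ASSUMPTION: the SDK may expect an author object instead
    parameters: Dict[str, SnakemakeParameter] = field(default_factory=dict)


metadata = SnakemakeV2Metadata(
    display_name="Snakemake Tutorial Workflow",  # hypothetical value
    author="Your Name",  # hypothetical value
    parameters={
        # The two inputs are plain directories.
        "samples_dir": SnakemakeParameter(type=LatchDir),
        "genome_dir": SnakemakeParameter(type=LatchDir),
        # The output location uses the output directory type.
        "results_dir": SnakemakeParameter(type=LatchOutputDir),
    },
)
```

The keys of the parameters dict match the three config parameters the pipeline expects, which is what lets the Console map each form field to a config value.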
Step 4: Generate the entrypoint
Now we need to generate the entrypoint file containing the Latch workflow that wraps our Snakemake workflow. This is a simple command:
latch snakemake generate-entrypoint .
This creates a directory called wf containing a file called entrypoint.py.
Step 5: Generate the Dockerfile
The last step before registering is to generate the Dockerfile that will define the environment the runtime executes in. In particular, we want that environment to contain the conda environment defined by environment.yaml.
Again, we can accomplish this with a simple command:
latch dockerfile --snakemake -c environment.yaml . -f
This generates a file called Dockerfile.
Step 6: Register the workflow
To register a Snakemake workflow on Latch, type:
latch register -y .
The latch register command searches for a Latch workflow in the current directory and registers it to Latch.
After running the above command, the Latch SDK will generate the necessary files and upload your workflow to Latch.
Once the workflow is registered, navigate to the Workflows tab in the Latch Console and select the workflow you previously registered.
Step 7: Execute the workflow
Before executing the workflow, we need to upload test data to Latch. You can upload the data using latch cp, which creates a folder called snakemake-tutorial-data in your account on Latch.
Now, navigate to the Workflows tab in the Latch Console and select the workflow you previously registered.
Then, select the appropriate input parameters from the test data you uploaded and click Launch Workflow to execute the workflow.
Step 8: Monitoring the workflow
After launching the workflow, you can monitor progress by clicking on the appropriate execution under the Executions tab of your workflow.
Under the Graph & Logs tab, you can view the generated DAG and monitor the execution of your Snakemake workflow.
Once the workflow starts executing, you can monitor the status of each rule in the workflow through the Latch Console interface.
Step 9 (Optional): Customizing the workflow using dynamic sample detection
You may have noticed that in the Snakefile, the sample names are hardcoded. This is not desirable: we should be able to infer the sample names from the contents of the samples directory. To accomplish this, we will need to edit both the Snakefile and the entrypoint file itself. Since we need to know the contents of the samples directory outside of a rule, we will need to stage it locally before the pipeline executes.
First, add the following import to the top of the wf/entrypoint.py file:
Next, edit snakemake_runtime(...) so that it is the following:
This stages samples_dir locally before calling snakemake; this way we will know the contents of the directory without needing to be in a rule.
Lastly, we will need to edit the Snakefile and remove the hardcoded samples:
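One way to infer sample names, once the samples directory is available locally, is to derive them from the file names. The sketch below is an illustration rather than the tutorial's own Snakefile change: the helper infer_sample_names is hypothetical, and it assumes each sample is a single file named <sample>.fastq directly inside the directory.

```python
import tempfile
from pathlib import Path


def infer_sample_names(samples_dir, suffix=".fastq"):
    """Return sorted sample names taken from files in samples_dir ending in suffix."""
    if not suffix:
        raise ValueError("suffix must be a non-empty string")
    directory = Path(samples_dir)
    return sorted(
        path.name[: -len(suffix)]  # strip the suffix to get the sample name
        for path in directory.iterdir()
        if path.is_file() and path.name.endswith(suffix)
    )


if __name__ == "__main__":
    # Build a throwaway directory that mimics a staged samples directory.
    with tempfile.TemporaryDirectory() as tmp:
        for name in ("B.fastq", "A.fastq", "notes.txt"):
            (Path(tmp) / name).touch()
        print(infer_sample_names(tmp))  # prints ['A', 'B']
```

In a Snakefile, the returned list could replace a hardcoded sample list, with the directory path read from the samples_dir config parameter.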
What You’ve Learned
Core Concepts:
- Latch’s Snakemake integration allows running Snakemake workflows with a graphical web interface.
- Metadata definition creates the parameter interface that scientists will use to configure and run workflows.
- Clone your Snakemake workflow.
- Define metadata to create the workflow’s parameter interface.
- Generate the entrypoint with latch snakemake generate-entrypoint . to create the Latch wrapper.
- Generate the Dockerfile with latch dockerfile --snakemake -c environment.yaml . -f for the execution environment.
- Register the pipeline with latch register -y .
- Upload test data to Latch and select inputs from the Console.
- Monitor execution through the Graph & Logs interface.
- Customize the generated entrypoint.py for additional pre- or post-processing logic.
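The three CLI commands listed above can be chained from a small script. This is a convenience sketch, not something the Latch SDK provides: the deploy function and its dry_run flag are hypothetical, and only the command strings come from this tutorial. By default it only returns the planned commands, so nothing runs unless you opt in.

```python
import shlex
import subprocess

# Commands copied from the summary, in the order the tutorial runs them.
DEPLOY_COMMANDS = [
    "latch snakemake generate-entrypoint .",
    "latch dockerfile --snakemake -c environment.yaml . -f",
    "latch register -y .",
]


def deploy(workflow_dir=".", dry_run=True):
    """Return the planned argv lists; run them in workflow_dir unless dry_run."""
    planned = [shlex.split(command) for command in DEPLOY_COMMANDS]
    if not dry_run:
        for argv in planned:
            # check=True raises CalledProcessError at the first failing command.
            subprocess.run(argv, cwd=workflow_dir, check=True)
    return planned


if __name__ == "__main__":
    for argv in deploy(dry_run=True):
        print(" ".join(argv))
```

Calling deploy(dry_run=False) from the workflow root would execute the commands in sequence and stop on the first failure.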
Next Steps
- Explore custom workflow interfaces
- Learn about testing and debugging workflows