Using Latch Storage
To have your Snakemake workflow read from and write to Latch Data, you need to configure your Snakefile to use the Latch Storage Plugin.
In either the input: or output: directive of a rule, you can specify a Latch Path by wrapping that path in a storage.latch function call. For example, a rule that takes the local file hello.txt as its input and declares storage.latch("latch://123.account/goodbye.txt") as its output will copy hello.txt onto Latch, giving it the path latch://123.account/goodbye.txt.
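The rule described above might look like the following minimal sketch. Only the storage.latch call and the two paths come from the text; the rule name hello and the cp command are assumptions for illustration, and the plugin configuration is assumed to live elsewhere in the Snakefile.

```snakemake
# Hypothetical rule name and shell command, for illustration only.
# The output is marked as a Latch Path by wrapping it in storage.latch(...).
rule hello:
    input:
        "hello.txt"
    output:
        storage.latch("latch://123.account/goodbye.txt")
    shell:
        "cp {input} {output}"
```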
Common Patterns / Pitfalls
Referencing Inputs/Outputs
Note that in the example above, the shell command never explicitly references latch://123.account/goodbye.txt, and instead references the {output} wildcard. This is intentional, and all rules that reference storage.latch() objects must use this pattern to function correctly.
Snakemake storage plugins generally work by performing all operations on a local copy of the remote file, then uploading that copy back to remote storage at the end of rule execution. In the example above, the {output} wildcard is replaced with the path of the local copy. This local copy is stored opaquely, and its location can change depending on how the pipeline is configured, so the only reliable way to reference it is through the wildcard. The same applies to inputs and the {input} wildcard, for exactly the same reason.
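The same pattern on the input side could be sketched as follows. The rule name count_lines, the local output file, and the wc command are hypothetical; the point is only that the shell command refers to {input} rather than to the latch:// path.

```snakemake
# Hypothetical rule that reads a file from Latch.
# {input} expands to the path of the local copy, wherever the plugin placed it,
# so the command never mentions latch://123.account/goodbye.txt directly.
rule count_lines:
    input:
        storage.latch("latch://123.account/goodbye.txt")
    output:
        "line_count.txt"
    shell:
        "wc -l < {input} > {output}"
```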
Wildcards and expand(...)
Due to a limitation of Snakemake, storage.latch must be placed outside of all expand calls. In other words, write storage.latch(expand(...)) rather than expand(storage.latch(...)).
The same applies to the directory flag.
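As a sketch of the ordering described above, assuming a hypothetical SAMPLES list and a target rule whose inputs are produced elsewhere in the workflow:

```snakemake
# Hypothetical sample names, for illustration only.
SAMPLES = ["a", "b"]

# Works: expand(...) builds the list of path strings first,
# then storage.latch(...) wraps the whole result.
rule all:
    input:
        storage.latch(expand("latch://123.account/{sample}.txt", sample=SAMPLES))

# Does not work: storage.latch(...) nested inside expand(...).
# rule all:
#     input:
#         expand(storage.latch("latch://123.account/{sample}.txt"), sample=SAMPLES)
```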
os.path.join(...) vs Path(...) / ...
Properly formatted Latch Paths start with latch://. Unfortunately, pathlib.Path condenses multiple slashes, so something like Path("latch://123.account/hello") / "goodbye.txt" will resolve to "latch:/123.account/hello/goodbye.txt" (note that it starts with latch:/, not latch://). To avoid this, use os.path.join instead. The above becomes os.path.join("latch://123.account/hello", "goodbye.txt"), which correctly resolves to "latch://123.account/hello/goodbye.txt" with the double slash.
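The difference can be checked in plain Python. This sketch uses PurePosixPath and posixpath.join so the result is the same on every platform; on Linux and macOS, Path and os.path.join behave identically to these. The variable names are illustrative.

```python
import posixpath
from pathlib import PurePosixPath

base = "latch://123.account/hello"

# pathlib parses the string into path components and drops the empty
# component between the two slashes, so the scheme separator is collapsed.
collapsed = str(PurePosixPath(base) / "goodbye.txt")

# posixpath.join (os.path.join on Linux/macOS) only concatenates with "/",
# leaving the "latch://" prefix untouched.
joined = posixpath.join(base, "goodbye.txt")

print(collapsed)  # latch:/123.account/hello/goodbye.txt
print(joined)     # latch://123.account/hello/goodbye.txt
```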