Workflow Environment
Workflow code is rarely free of dependencies. It may require Python or system packages, or make use of environment variables. For example, a task that downloads compressed reference data from AWS S3 will need the aws-cli and unzip APT packages, and may then use the pyyaml Python package to read the included metadata.
The workflow environment is encapsulated in a Docker container, which is created from a recipe defined in a Dockerfile.
Latch provides automatic Dockerfile generation via the latch dockerfile command. You can pass a set of requirements files (detailed below) to this command to configure the generated Dockerfile to install specific dependencies in the environment.
Python: requirements.txt
Dependencies from a requirements.txt file can be automatically installed using pip. To enable this, pass the path to your requirements.txt file to the latch dockerfile command using the -p/--pip-requirements option.
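As a minimal sketch of the step above: the package pin is illustrative, and the example assumes that latch dockerfile takes the workflow root directory as a positional argument (here the current directory). Only the --pip-requirements option itself comes from this page.

```shell
# Write an illustrative requirements.txt (the package and version bound are examples).
cat > requirements.txt <<'EOF'
pyyaml>=6.0
EOF

# Generate the Dockerfile if the Latch CLI is available.
# Assumption: the workflow root (".") is passed as a positional argument.
if command -v latch >/dev/null 2>&1; then
  latch dockerfile . --pip-requirements requirements.txt
fi
```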
Python: setup.py, PEP-621 pyproject.toml
Workflows with a package specification in a setup.py file or a PEP-621 compliant pyproject.toml file can be automatically installed using pip. Pass the path to either the setup.py or the pyproject.toml using the -i/--pyproject option.
Poetry pyproject.toml files are not supported.
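For illustration, a PEP-621 style pyproject.toml might look like the sketch below. The project name, version, and dependency are hypothetical, the setuptools build backend is one possible choice rather than a Latch requirement, and the positional "." argument to latch dockerfile is an assumption.

```shell
# Write an illustrative PEP-621 pyproject.toml (all names and versions are hypothetical).
cat > pyproject.toml <<'EOF'
[build-system]
requires = ["setuptools>=61"]
build-backend = "setuptools.build_meta"

[project]
name = "example_workflow"
version = "0.1.0"
dependencies = ["pyyaml>=6.0"]
EOF

# Generate the Dockerfile if the Latch CLI is available.
# Assumption: the workflow root (".") is passed as a positional argument.
if command -v latch >/dev/null 2>&1; then
  latch dockerfile . --pyproject pyproject.toml
fi
```

Project metadata lives under the [project] table, which is what distinguishes a PEP-621 file from a Poetry one that uses [tool.poetry].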
Conda environment.yaml
The Conda environment in an environment.yaml file can be automatically installed using mamba env create --file, with the latest mamba installed via Miniforge. This environment will be used by default.
To enable this, pass the path to the environment file using the -c/--conda-env option.
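A small environment file could look like the following sketch. The environment name, channel, and package are illustrative choices, and the positional "." argument to latch dockerfile is an assumption rather than something stated on this page.

```shell
# Write an illustrative Conda environment file (name, channel, and package are examples).
cat > environment.yaml <<'EOF'
name: workflow
channels:
  - conda-forge
dependencies:
  - pyyaml
EOF

# Generate the Dockerfile if the Latch CLI is available.
# Assumption: the workflow root (".") is passed as a positional argument.
if command -v latch >/dev/null 2>&1; then
  latch dockerfile . --conda-env environment.yaml
fi
```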
R: environment.R
A script in an environment.R file can be automatically executed when the Dockerfile is built. This is intended for installing dependencies, but there are no actual limits on what the script does. The script is executed using rig. By default, this uses the latest R version, though you can change this by editing the run rig add release line in the generated Dockerfile.
To enable this, pass the path to the R file using the -r/--r-env flag.
Note that some R packages may have system dependencies that need to be installed using APT or another method. These packages will list these dependencies in their documentation. Missing dependencies will cause crashes during workflow build or when using the packages.
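As a hedged example, an environment.R that installs a single CRAN package might be set up as follows. The package and mirror URL are illustrative, and the positional "." argument to latch dockerfile is an assumption.

```shell
# Write an illustrative environment.R (the package and CRAN mirror are examples).
cat > environment.R <<'EOF'
install.packages("jsonlite", repos = "https://cloud.r-project.org")
EOF

# Generate the Dockerfile if the Latch CLI is available.
# Assumption: the workflow root (".") is passed as a positional argument.
if command -v latch >/dev/null 2>&1; then
  latch dockerfile . --r-env environment.R
fi
```

If a package in this script needs system libraries, those would be installed separately, for example through an APT requirements file.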
System: APT
Packages listed in a text file can also be installed using apt. Each line of the file must contain one apt package (in a format consistent with what is specified here). To enable this, pass the path to the file using the -a/--apt-requirements option.
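A sketch of such a file, with one package per line, is shown below. The file name system-requirements.txt and the chosen packages are hypothetical, and the positional "." argument to latch dockerfile is an assumption.

```shell
# Write an illustrative APT requirements file: one package name per line.
# The file name is hypothetical; any path can be passed to the option.
cat > system-requirements.txt <<'EOF'
unzip
curl
EOF

# Generate the Dockerfile if the Latch CLI is available.
# Assumption: the workflow root (".") is passed as a positional argument.
if command -v latch >/dev/null 2>&1; then
  latch dockerfile . --apt-requirements system-requirements.txt
fi
```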
Environment Variables
Environment variables contained in a file can be automatically added to the workflow environment. Pass the path of the file containing the environment variables using the -d/--direnv option.
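This page does not specify the file format, so the sketch below assumes one KEY=VALUE pair per line. The file name .env, the variable names, and the values are all hypothetical, as is the positional "." argument to latch dockerfile.

```shell
# Write an illustrative environment file.
# Assumption: one KEY=VALUE pair per line; names and values are hypothetical.
cat > .env <<'EOF'
REFERENCE_BUCKET=s3://example-bucket/reference
LOG_LEVEL=info
EOF

# Generate the Dockerfile if the Latch CLI is available.
# Assumption: the workflow root (".") is passed as a positional argument.
if command -v latch >/dev/null 2>&1; then
  latch dockerfile . --direnv .env
fi
```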
Example of Auto-generated Dockerfile
The following Dockerfile is generated in the subprocess template (using latch init --template subprocess --dockerfile example_workflow):
Note on Python Requirements
The order of Python requirement installation is as follows:
1. conda
2. setup.py/pyproject.toml
3. requirements.txt
Consequently, a package specified in the requirements.txt file will overwrite a previous install of the same package installed by the conda environment.
Excluding Files
By default, all files in the workflow root directory are included in the workflow build. Unnecessary files increase the size of the resulting workflow container image, and registration and startup times grow in proportion to their size.
To exclude files from the build, use a .dockerignore file. Files can be specified one at a time or using glob patterns.
The default .dockerignore includes files auto-generated by Latch.
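As an illustration of both styles, the sketch below appends a single file entry and two glob patterns to a .dockerignore. The file and directory names are hypothetical. Appending with >> keeps any entries that Latch has already written to the file.

```shell
# Append illustrative exclusions to .dockerignore (all paths are hypothetical).
cat >> .dockerignore <<'EOF'
# A single file
notes.md
# Glob patterns
*.log
data/raw/**
EOF
```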
GPU Task Limitations
Commands that require certain kernel capabilities will fail with “Permission denied” in GPU tasks (small-gpu-task, large-gpu-task, v100_x*_task, and g6e_*_task). This includes mount and chroot, among others.