Workflow code is rarely free of dependencies. It may require Python or system packages, or make use of environment variables. For example, a task that downloads compressed reference data from AWS S3 will need the awscli and unzip APT packages, and then the pyyaml Python package to read the included metadata.

The workflow environment is encapsulated in a Docker container, which is created from a recipe defined in a Dockerfile.

Latch provides automatic Dockerfile generation via the latch dockerfile command. You can pass a set of requirements files (detailed below) to this command to configure the generated Dockerfile to install specific dependencies in the environment.

Python: requirements.txt

Dependencies from a requirements.txt file can be automatically installed using pip. To enable this, pass the path to your requirements.txt file to the latch dockerfile command using the -p/--pip-requirements option.
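As a minimal sketch, the snippet below writes an example requirements.txt and then regenerates the Dockerfile with it. The package pins are purely illustrative, and passing "." as the workflow root directory is an assumption about how the command is invoked; only the -p/--pip-requirements option comes from the text above.

```shell
# Write an example requirements.txt (package pins are illustrative).
cat > requirements.txt <<'EOF'
pyyaml==6.0.1
requests>=2.31
EOF

# Regenerate the Dockerfile with pip requirements enabled.
# Assumption: "." is the workflow root directory.
# The guard skips this step when the latch CLI is not installed.
if command -v latch >/dev/null 2>&1; then
  latch dockerfile . --pip-requirements requirements.txt
fi
```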

Python: setup.py, PEP-621 pyproject.toml

Workflows with a package specification in a setup.py file or a PEP-621 compliant pyproject.toml file can be automatically installed using pip. Pass the path to either the setup.py or the pyproject.toml using the -i/--pyproject option.

Poetry pyproject.toml files are not supported.
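For illustration, a PEP-621 pyproject.toml declares its metadata in a [project] table. The sketch below writes a small one and checks for that table before calling the command. The project name, version, dependency, and setuptools build backend are hypothetical choices, and "." as the workflow root is an assumption.

```shell
# Write a minimal PEP-621 pyproject.toml (all values are illustrative).
cat > pyproject.toml <<'EOF'
[build-system]
requires = ["setuptools>=61"]
build-backend = "setuptools.build_meta"

[project]
name = "example-workflow"
version = "0.1.0"
dependencies = ["pyyaml"]
EOF

# Quick sanity check: PEP-621 metadata lives in a [project] table.
if grep -q '^\[project\]' pyproject.toml; then
  echo "found [project] table"
fi

# Assumption: "." is the workflow root directory.
if command -v latch >/dev/null 2>&1; then
  latch dockerfile . --pyproject pyproject.toml
fi
```

A file that keeps its package metadata only under a [tool.poetry] table would not pass the check above.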

Conda environment.yaml

The Conda environment defined in an environment.yaml file can be created automatically using mamba env create --file, with the latest mamba installed via Miniforge. This environment is used by default.

To enable, pass the path to the environment file using the -c/--conda-env option.
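A minimal sketch of this setup follows. The environment name, channels, and packages are illustrative only, and "." as the workflow root is an assumption.

```shell
# Write an example Conda environment file (contents are illustrative).
cat > environment.yaml <<'EOF'
name: workflow
channels:
  - conda-forge
  - bioconda
dependencies:
  - python=3.11
  - samtools
EOF

# Assumption: "." is the workflow root directory.
if command -v latch >/dev/null 2>&1; then
  latch dockerfile . --conda-env environment.yaml
fi
```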

R: environment.R

A script in an environment.R file can be automatically executed when the Dockerfile is built. This is intended for installing dependencies, but there are no actual limits on what the script does. The script is executed using rig. By default, this uses the latest R version; to change it, edit the run rig add release line in the generated Dockerfile.

To enable this, pass in the path to the R file using the -r/--r-env flag.

Note that some R packages may have system dependencies that need to be installed using APT or another method. These packages will list these dependencies in their documentation. Missing dependencies will cause crashes during workflow build or when using the packages.
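The sketch below pairs an R package with the system library it needs: the xml2 R package builds against libxml2, which Debian and Ubuntu ship as libxml2-dev. The file name system-requirements.txt matches the generated Dockerfile example in this document. It assumes that system packages are installed before the R script runs and that "." is the workflow root; verify both against the Dockerfile that is actually generated.

```shell
# System library required to build the xml2 R package.
cat > system-requirements.txt <<'EOF'
libxml2-dev
EOF

# R script executed at build time (the repository URL is one common choice).
cat > environment.R <<'EOF'
install.packages("xml2", repos = "https://cloud.r-project.org")
EOF

# Assumption: "." is the workflow root, and APT packages are installed
# before environment.R is executed.
if command -v latch >/dev/null 2>&1; then
  latch dockerfile . --apt-requirements system-requirements.txt --r-env environment.R
fi
```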

System: APT

System packages listed in a text file can be installed automatically using APT. Each line of the file must contain one APT package (in a format consistent with what is specified here). To enable this, pass the path to the file using the -a/--apt-requirements option.
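As a minimal sketch, the file below lists one package per line. The package choices are illustrative, and "." as the workflow root is an assumption.

```shell
# One APT package name per line (packages are illustrative).
cat > system-requirements.txt <<'EOF'
unzip
awscli
EOF

# Assumption: "." is the workflow root directory.
if command -v latch >/dev/null 2>&1; then
  latch dockerfile . --apt-requirements system-requirements.txt
fi
```

In the generated Dockerfile shown later in this document, the file is fed to apt-get install through xargs, so it is safest to keep the file to package names only, without comments.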

Environment Variables

Environment variables contained in a file can be automatically added to the workflow environment. Pass the path of the file containing the environment variables using the -d/--direnv option.
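A sketch of such a file follows, reusing the BOWTIE2_INDEXES variable that appears in the generated Dockerfile example in this document. The file name .env and the KEY=value line format are assumptions, as is "." as the workflow root.

```shell
# Assumed format: one KEY=value pair per line. The file name is hypothetical.
cat > .env <<'EOF'
BOWTIE2_INDEXES=reference
EOF

# Assumption: "." is the workflow root directory.
if command -v latch >/dev/null 2>&1; then
  latch dockerfile . --direnv .env
fi
```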

Example of Auto-generated Dockerfile

The following Dockerfile is generated in the subprocess template (using latch init --template subprocess --dockerfile example_workflow):

# latch base image + dependencies for latch SDK --- removing these will break the workflow
from 812206152185.dkr.ecr.us-west-2.amazonaws.com/latch-base:ace9-main
run pip install latch==2.12.1
run mkdir /opt/latch

# install system requirements
copy system-requirements.txt /opt/latch/system-requirements.txt
run apt-get update --yes && xargs apt-get install --yes </opt/latch/system-requirements.txt

# copy all code from package (use .dockerignore to skip files)
copy . /root/

# set environment variables
env BOWTIE2_INDEXES=reference

# latch internal tagging system + expected root directory --- changing these lines will break the workflow
arg tag
env FLYTE_INTERNAL_IMAGE $tag
workdir /root

Note on Python Requirements

The order of Python requirement installation is as follows:

  1. conda
  2. setup.py / pyproject.toml
  3. requirements.txt

Consequently, a package specified in the requirements.txt file will overwrite a previous install of the same package from the conda environment.
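To make the override concrete, the sketch below pins the same package in both files. The package and versions are illustrative, and "." as the workflow root is an assumption. Given the installation order above, the requirements.txt pin is the one expected to remain in the final image.

```shell
# Conda environment pinning numpy 1.26 (illustrative).
cat > environment.yaml <<'EOF'
name: workflow
channels:
  - conda-forge
dependencies:
  - python=3.11
  - numpy=1.26
EOF

# requirements.txt pinning a different numpy version (illustrative).
# It is installed last, so this pin is expected to replace the conda one.
cat > requirements.txt <<'EOF'
numpy==2.0.0
EOF

# Assumption: "." is the workflow root directory.
if command -v latch >/dev/null 2>&1; then
  latch dockerfile . --conda-env environment.yaml --pip-requirements requirements.txt
fi
```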

Excluding Files

By default, all files in the workflow root directory are included in the workflow build. Unnecessary files increase the size of the resulting workflow container image, and registration and startup times grow in proportion to their size.

To exclude files from the build, use a .dockerignore file. Files can be specified one at a time or using glob patterns.

The default .dockerignore includes files auto-generated by Latch.
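As a minimal sketch, the snippet below appends a few entries to .dockerignore rather than overwriting it, so any default entries are preserved. The specific paths and patterns are hypothetical examples.

```shell
# Append to .dockerignore so existing entries are kept.
cat >> .dockerignore <<'EOF'
# a single file
notes.md
# a directory of local test data
test_data/
# glob patterns
*.bam
**/__pycache__/
EOF
```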

GPU Task Limitations

Commands that require certain kernel capabilities will fail with “Permission denied” in GPU tasks (small-gpu-task, large-gpu-task, v100_x*_task, and g6e_*_task). This includes mount and chroot among others.