Troubleshooting

Problem	Common Solution
`The above error occured when reading the Snakefile to extract workflow metadata.`	The Snakefile has errors outside of any rule. This is frequently caused by missing dependencies (look for `ModuleNotFoundError`). Either install the dependencies or add a `latch_metadata.py` file.
`snakemake.exceptions.WorkflowError: Workflow defines configfile config.yaml but it is not present or accessible (full checked path: /root/config.yaml)`	Include a `config.yaml` in the workflow Docker image. Currently, config files cannot be generated from workflow parameters.
`Command '['/usr/local/bin/python', '-m', 'latch_cli.snakemake.single_task_snakemake', ...]' returned non-zero exit status 1.`	The runtime single-job task failed. Look at the logs to find the error, which is marked with the string `[!] Failed`.
Runtime workflow task fails with `FileNotFoundError in file /root/workflow/Snakefile` but the file is specified in workflow parameters	Wrap the code that reads the file in a function. See section “Input Files Referenced Outside of Rules”.
MultiQC `No analysis results found. Cleaning up..`	FastQC outputs two files for every FastQ file: the raw `.zip` data and the HTML report. Include the raw `.zip` outputs of FastQC in the MultiQC rule inputs. See section “Input Files Not Explicitly Defined in Rules”.

Troubleshooting: Input Files Referenced Outside of Rules

Only the JIT workflow downloads every input file. Tasks at runtime will only download files their target rules explicitly depend on. This means that Snakefile code that is not under a rule will usually fail if it tries to read input files. Example:

from pathlib import Path

# ERROR: this reads the "inputs" directory in every task, regardless of which rule is executing!
# Sample names are the file stems, e.g. "inputs/example.fastq" -> "example"
samples = [p.stem for p in Path("inputs").glob("*.fastq")]

rule all:
  input:
    expand("fastqc/{sample}.html", sample=samples)

rule fastqc:
  input:
    "inputs/{sample}.fastq"
  output:
    "fastqc/{sample}.html"
  shell:
    "fastqc {input} -o {output}"

Since the `Path("inputs").glob(...)` call is not under any rule, it runs in every task. Because the `fastqc` rule does not declare the `inputs` directory as an input, the directory is not downloaded and the code throws an error.

Solution

Only access files when necessary (i.e. when computing dependencies as in the example, or in a rule body) by placing problematic code within rule definitions. Either directly inline the variable or write a function to use in place of the variable. Example:

from pathlib import Path

rule all_inline:
  input:
    # This code will only run in the JIT step
    expand("fastqc/{sample}.html", sample=[p.stem for p in Path("inputs").glob("*.fastq")])

def get_samples():
  # This code will only run if the function is called
  return [p.stem for p in Path("inputs").glob("*.fastq")]

rule all_function:
  input:
    expand("fastqc/{sample}.html", sample=get_samples())

This works because the JIT step replaces `input`, `output`, `params`, and other declarations with static strings for the runtime workflow. Any function calls within those declarations are replaced with pre-computed strings, so the Snakefile does not attempt to read the files again. The same example at runtime:

rule all_inline:
  input:
    "fastqc/example.html"

def get_samples():
  # Note: this function is no longer called anywhere in the file
  return [p.stem for p in Path("inputs").glob("*.fastq")]

rule all_function:
  input:
    "fastqc/example.html"

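The function-based fix relies on ordinary Python semantics: a function body does not run until the function is called. The sketch below illustrates this outside Snakemake. The `input_dir` parameter and the `demo` helper are assumptions made for illustration and are not part of the Latch or Snakemake APIs.

```python
import tempfile
from pathlib import Path


def get_samples(input_dir: Path) -> list:
    """Return sample names (file stems) for every FASTQ file in input_dir.

    The directory is only touched when this function is called.
    """
    return sorted(p.stem for p in input_dir.glob("*.fastq"))


def demo() -> list:
    """Hypothetical driver: create the inputs only after get_samples is defined."""
    with tempfile.TemporaryDirectory() as tmp:
        input_dir = Path(tmp) / "inputs"
        # Defining get_samples above did not require the directory to exist.
        assert not input_dir.exists()
        input_dir.mkdir()
        (input_dir / "a.fastq").touch()
        (input_dir / "b.fastq").touch()
        (input_dir / "notes.txt").touch()  # ignored by the *.fastq pattern
        return get_samples(input_dir)


if __name__ == "__main__":
    print(demo())
```

A module-level assignment such as `samples = ...` would instead run as soon as the file is loaded, which is what happens in every runtime task.
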
Example using multiple return values:

def get_samples_data():
  # glob() returns a generator, so materialize it before iterating twice
  paths = list(Path("inputs").glob("*.fastq"))
  return {
    "samples": [p.stem for p in paths],
    "names": [p.name for p in paths]
  }

rule all:
  input:
    expand("fastqc/{sample}.html", sample=get_samples_data()["samples"]),
    expand("reports/{name}.txt", name=get_samples_data()["names"]),
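
In the rule above, `get_samples_data()` is called once per `expand`, so the directory is listed twice during the JIT step. One possible refinement, sketched below under the assumption that the input directory does not change while the Snakefile is evaluated, is to memoize the helper with `functools.lru_cache`. The `input_dir` parameter and the `demo` function are hypothetical and exist only to make the example self-contained.

```python
import tempfile
from functools import lru_cache
from pathlib import Path


@lru_cache(maxsize=None)
def get_samples_data(input_dir: Path) -> dict:
    """List the FASTQ files once and derive both return values from that listing."""
    # glob() returns a generator, so materialize it before reusing it.
    paths = sorted(input_dir.glob("*.fastq"))
    return {
        "samples": [p.stem for p in paths],
        "names": [p.name for p in paths],
    }


def demo():
    """Hypothetical driver: call the helper twice and report cache statistics."""
    get_samples_data.cache_clear()
    with tempfile.TemporaryDirectory() as tmp:
        input_dir = Path(tmp)
        (input_dir / "a.fastq").touch()
        (input_dir / "b.fastq").touch()
        samples = get_samples_data(input_dir)["samples"]
        names = get_samples_data(input_dir)["names"]
        info = get_samples_data.cache_info()
        return samples, names, info.misses, info.hits


if __name__ == "__main__":
    print(demo())
```

The second call is served from the cache, so the directory is scanned only once. Callers should treat the returned dictionary as read-only, since the same object is shared between calls.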