Upload your ATAC Seq data

Navigate to the Latch Data tab on the left panel.

Use the ‘Upload’ modal at the top right to upload your raw .fastq or .fastq.gz files to Latch Data. Alternatively, you can upload your data with the command line interface.
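Before uploading, it can help to confirm which files in a local run directory are actually raw FASTQ files. The following is a minimal sketch of such a check; the function name `find_fastq_files` and the idea of scanning a local directory first are illustrative assumptions, not part of Latch.

```python
from pathlib import Path

# The two raw-read extensions named in the upload step above.
FASTQ_SUFFIXES = (".fastq", ".fastq.gz")


def find_fastq_files(run_dir):
    """Return sorted paths under run_dir whose names end in .fastq or .fastq.gz."""
    root = Path(run_dir)
    return sorted(
        path
        for path in root.rglob("*")
        if path.is_file() and path.name.endswith(FASTQ_SUFFIXES)
    )
```

Calling `find_fastq_files("my_run")` (a hypothetical directory name) lists the files that are candidates for upload, so that stray files such as logs or notes are not uploaded alongside the reads.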

Set up sample sheet on Latch Registry

  1. Navigate to the Latch Registry tab on the left panel.
  2. Create a new table.
  3. Bulk import the sequencing run: navigate to Import, select Bulk Import Sequencing Run, and browse to the directory containing the sequencing data.
  4. Automatically parse sample names.
  5. Create a samplesheet and add a column for replicate numbers.
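To illustrate what parsing sample names and adding a replicate column might look like, here is a rough sketch in Python. It is not how Latch Registry parses names internally. It assumes Illumina-style file names (`<sample>_S<n>_L<lane>_R<1|2>_001.fastq.gz`), one lane per sample, and a hypothetical convention in which sample names end in `_rep<N>`. The column names `sample`, `fastq_1`, `fastq_2`, and `replicate` are also illustrative and may differ from the columns in your Registry table.

```python
import re
from collections import defaultdict

# Assumed Illumina-style naming: <sample>_S<n>[_L<lane>]_R<read>_001.fastq[.gz]
FASTQ_RE = re.compile(
    r"^(?P<sample>.+?)_S\d+(?:_L\d{3})?_R(?P<read>[12])_001\.fastq(?:\.gz)?$"
)
# Hypothetical convention: a sample name ending in _rep<N> carries its replicate number.
REPLICATE_RE = re.compile(r"^(?P<group>.+)_rep(?P<rep>\d+)$")


def build_samplesheet(filenames):
    """Group FASTQ file names by parsed sample name and derive a replicate column."""
    reads = defaultdict(dict)
    for name in filenames:
        match = FASTQ_RE.match(name)
        if match is None:
            continue  # skip files that do not follow the assumed pattern
        # Assumes one lane per sample; a second lane would overwrite the first.
        reads[match["sample"]][match["read"]] = name

    rows = []
    for sample in sorted(reads):
        rep = REPLICATE_RE.match(sample)
        rows.append(
            {
                "sample": rep["group"] if rep else sample,
                "fastq_1": reads[sample].get("1", ""),
                "fastq_2": reads[sample].get("2", ""),
                # Default to replicate 1 when the name carries no _rep<N> suffix.
                "replicate": int(rep["rep"]) if rep else 1,
            }
        )
    return rows
```

Each returned dictionary corresponds to one samplesheet row. Single-end samples simply leave `fastq_2` empty.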

Launch NFCore/ATACSeq Workflow

  1. Navigate to the Latch Workflows tab on the left panel.

  2. Choose the ‘nf-core/atacseq’ workflow in ‘All Workflows’ or ‘My Workflows’ (if you have added it).

  3. Import rows from the registry.

  4. Specify the reference genome. Either choose a Latch Verified Reference Genome or provide your own Custom Reference Genome.

  5. Specify the aligner used to align reads to the reference genome.

  6. Specify the options for MACS2, the peak-calling software.

  7. After configuring the workflow parameters, click Launch Workflow.
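The aligner and MACS2 choices made in the steps above determine where results end up. The sketch below models those two choices in Python. It is an assumption-laden illustration: the names `aligner` and `narrow_peak` and the list of aligners follow the upstream nf-core/atacseq command-line parameters, and the Latch launch form may label or restrict them differently. The class `WorkflowParams` is hypothetical.

```python
from dataclasses import dataclass

# Aligners accepted by upstream nf-core/atacseq; the Latch form may offer a subset.
SUPPORTED_ALIGNERS = ("bwa", "bowtie2", "chromap", "star")


@dataclass(frozen=True)
class WorkflowParams:
    """Hypothetical container for the aligner and MACS2 peak-mode choices."""

    aligner: str = "bwa"
    narrow_peak: bool = False  # False means MACS2 runs in broad-peak mode

    def __post_init__(self):
        if self.aligner not in SUPPORTED_ALIGNERS:
            raise ValueError(
                f"Unsupported aligner {self.aligner!r}; "
                f"expected one of {SUPPORTED_ALIGNERS}"
            )

    @property
    def peak_type(self):
        """Name of the MACS2 output sub-directory implied by the peak mode."""
        return "narrow_peak" if self.narrow_peak else "broad_peak"
```

With the defaults, `WorkflowParams().peak_type` is `"broad_peak"`, which matches the `broad_peak` directory that appears in the output paths listed in the results section.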

Results from running the NFCore/ATACSeq workflow

Key Takeaways

  1. The workflow results are available in [output directory]/[run name] on Latch Data.
  2. The workflow produces alignments, peak files, bigwig files that can be loaded into IGV, and peak annotations from HOMER.
  3. Sample-level analysis results:
    • Peak files: [output directory]/[run name]/[aligner name]/merged_replicate/macs2/broad_peak/[sample id].mRp.clN_peaks.xls
    • Peak annotations: [output directory]/[run name]/[aligner name]/merged_replicate/macs2/broad_peak/[sample id].mRp.clN_peaks.annotatePeaks.txt
    • Bigwig files viewable in IGV: [output directory]/[run name]/[aligner name]/merged_replicate/macs2/bigwig/[sample id].mRp.clN.bigWig
  4. Replicate-level analysis results:
    • Peak files: [output directory]/[run name]/[aligner name]/merged_replicate/macs2/bigwig/[sample id].mRp.clN_peaks.xls
    • Peak annotations: [output directory]/[run name]/[aligner name]/merged_replicate/macs2/bigwig/[sample id].mRp.clN_peaks.annotatePeaks.txt
    • Bigwig files viewable in IGV: [output directory]/[run name]/[aligner name]/merged_replicate/macs2/bigwig/[sample id].mRp.clN.bigWig
  5. Differential accessibility analysis results:
    • PCA results: [output directory]/[run name]/[aligner name]/merged_library/macs2/broad_peak/consensus/deseq2/consensus_peaks.mLb.clN.pca.vals.txt
    • Sample distance matrix: [output directory]/[run name]/[aligner name]/merged_library/macs2/broad_peak/consensus/deseq2/consensus_peaks.mLb.clN.sample.dists.txt
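The path templates above can be filled in programmatically. The following sketch builds the sample-level and differential-accessibility paths exactly as listed in the key takeaways; the function names and the example values used to call them are hypothetical.

```python
from pathlib import PurePosixPath


def sample_level_outputs(output_dir, run_name, aligner, sample_id):
    """Build the sample-level (merged_replicate) output paths for one sample."""
    base = PurePosixPath(output_dir) / run_name / aligner / "merged_replicate" / "macs2"
    prefix = f"{sample_id}.mRp.clN"
    return {
        "peaks": base / "broad_peak" / f"{prefix}_peaks.xls",
        "annotations": base / "broad_peak" / f"{prefix}_peaks.annotatePeaks.txt",
        "bigwig": base / "bigwig" / f"{prefix}.bigWig",
    }


def differential_outputs(output_dir, run_name, aligner):
    """Build the DESeq2 PCA and sample-distance paths for a run."""
    base = (
        PurePosixPath(output_dir)
        / run_name
        / aligner
        / "merged_library"
        / "macs2"
        / "broad_peak"
        / "consensus"
        / "deseq2"
    )
    return {
        "pca_values": base / "consensus_peaks.mLb.clN.pca.vals.txt",
        "sample_distances": base / "consensus_peaks.mLb.clN.sample.dists.txt",
    }
```

For example, `sample_level_outputs("atac_out", "run1", "bwa", "WT")["peaks"]` yields `atac_out/run1/bwa/merged_replicate/macs2/broad_peak/WT.mRp.clN_peaks.xls`, where `atac_out`, `run1`, and `WT` are placeholder values.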

Supplementary Results

  1. Intermediate and quality-control outputs are available at:
    • Intermediate alignments: [output directory]/[run name]/[aligner name]
    • FastQC reports: [output directory]/[run name]/fastqc
    • MultiQC reports: [output directory]/[run name]/multiqc
    • Genome information: [output directory]/[run name]/genome
    • Trimmed read information: [output directory]/[run name]/trimgalore
  2. The pipeline also creates two other directories, [output directory]/[run name]/R_Plots and [output directory]/[run name]/cov_parquet/, which host the data matrices required for plotting results with the Latch Verified ATACSeq Plots layout.

Sample registry tables

The workflow also creates a Registry table, named after the run, in the project “ATAC_Seq_Results”. This table holds the data tables computed by the workflow. Loading it into the verified ATAC Seq plots layout visualizes the workflow results.

Plotting Layout

  1. Create a new plotting template by choosing the Verified ATAC Seq Plots Layout from the list of available plot layouts.

  2. Load the data matrices by loading the registry table created by the workflow.

  3. The plotting layout automatically loads all the dataframes it needs and generates the plots.

  4. The plotting layout also makes it easy to visualize peaks across samples and lets you search by gene and chromosome.