CRISPResso2 is a software pipeline for the analysis of genome editing experiments. It is designed to enable rapid and intuitive interpretation of results produced by amplicon sequencing.
Briefly, CRISPResso2
aligns sequencing reads to a reference sequence
quantifies insertions, mutations and deletions to determine whether a read is modified or unmodified by genome editing
summarizes editing results in intuitive plots and datasets
Find CRISPResso2 in “All Workflows” and open the workflow
Enter the parameters for CRISPResso2
First, add your Fastq read files. Make sure they have been uploaded to Latch in the Data tab; you can then select them in the modal.
If your files are single end reads then you only have to select a file for the Read 1 parameter.
If the file you selected is interleaved (paired end reads in a single fastq file), make sure to enable the Read 1 is Interleaved parameter.
If you have paired end reads then make sure to also select a file for the Read 2 parameter.
Then add the Amplicon Sequences used. If you have more than one, click the plus button to add additional sequences.
You can add a name for each amplicon sequence given. If you have multiple amplicons, make sure the number and order of the names correspond to the amplicons given above. By default CRISPResso uses “Reference” as the amplicon sequence name.
(Optional) Then add your Guide Sequences (sgRNA).
Same as with amplicons, you can add a name for each guide sequence given. If you have multiple guides make sure the number and order of the names correspond to the guides given above.
Then fill out the Output Prefix and Output Location and click Launch Workflow.
When the run finishes, your results will show up in the Data tab.
If you want to run multiple executions of this workflow, click the large plus button at the bottom of the parameters to add an additional execution.
Many of the optional parameters are grouped under Hidden Parameters. Click that section if you would like to fine-tune your run or use any of the advanced parameters.
Marks the file in Read 1 as containing interleaved reads. CRISPResso will split the paired end reads into two files before running. Do not enable this if Read 2 has an input.
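To make the interleaved layout concrete, here is a minimal Python sketch of splitting such a file into separate Read 1 and Read 2 records. It is an illustration only, not CRISPResso's own implementation. It assumes four lines per record and strictly alternating mates (R1, R2, R1, R2, ...), and the function name `split_interleaved` is hypothetical.

```python
def split_interleaved(lines):
    """Split interleaved FASTQ lines into (read1_lines, read2_lines).

    Assumption: each record is exactly 4 lines and mates alternate
    R1, R2, R1, R2, ... so every pair occupies 8 consecutive lines.
    """
    lines = [ln.rstrip("\n") for ln in lines]
    if len(lines) % 8 != 0:
        raise ValueError("expected a whole number of read pairs (8 lines per pair)")
    r1, r2 = [], []
    for start in range(0, len(lines), 8):
        r1.extend(lines[start:start + 4])      # first mate of the pair
        r2.extend(lines[start + 4:start + 8])  # second mate of the pair
    return r1, r2


demo = ["@p1/1", "ACGT", "+", "IIII", "@p1/2", "TTGA", "+", "IIII"]
read1, read2 = split_interleaved(demo)
print(read1[1], read2[1])
```

If Read 2 already has its own file, the reads are not interleaved and no split of this kind is needed.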
The second Fastq file for paired end reads. If both Read 1 and Read 2 contain Fastq files, CRISPResso will assume they contain paired end reads and run accordingly.
A name for the reference amplicon can be given. Multiple names can be specified here, and their order must correspond to the amplicon sequences given above.
sgRNAs should be input as the guide RNA sequence (usually 20 nt) immediately adjacent to but not including the PAM sequence (5’ (left) of NGG for SpCas9). If the sgRNA is not provided, quantification may include modifications far from the predicted editing site and may result in overestimation of editing rates.
The directory where the files produced by this workflow will be placed. You can either select an existing path or type a new one in the field, in which case Latch will automatically create the folders in the data viewer.
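Before launching, it can help to confirm that a guide was entered without its PAM and actually sits next to one in the amplicon. The following Python sketch searches both strands of an amplicon for the guide and checks that the three bases immediately 3’ of it match NGG. It is a hedged illustration for SpCas9 only; the helper names are hypothetical and this is not how CRISPResso2 itself locates guides.

```python
def reverse_complement(seq):
    comp = {"A": "T", "C": "G", "G": "C", "T": "A", "N": "N"}
    return "".join(comp[b] for b in reversed(seq.upper()))


def find_guide_with_pam(amplicon, guide):
    """Return (strand, start, pam) for the first guide match followed by NGG.

    `start` is a 0-based index on the searched strand. Returns None when the
    guide is absent or never followed by an NGG PAM.
    """
    guide = guide.upper().replace("U", "T")  # accept RNA or DNA letters
    for strand, seq in (("+", amplicon.upper()), ("-", reverse_complement(amplicon))):
        start = seq.find(guide)
        while start != -1:
            pam = seq[start + len(guide): start + len(guide) + 3]
            if len(pam) == 3 and pam[1:] == "GG":
                return strand, start, pam
            start = seq.find(guide, start + 1)
    return None


guide = "GATTACAGATTACAGATTAC"            # made-up 20 nt guide, PAM not included
amplicon = "TTT" + guide + "TGG" + "AAA"  # made-up amplicon
print(find_guide_with_pam(amplicon, guide))
```

A `None` result suggests the guide was entered with its PAM attached, on the wrong strand orientation, or does not belong to this amplicon.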
Sequences must have at least this homology percentage score with the amplicon to be aligned. After reads are aligned to a reference sequence, the homology is calculated as the number of bp they have in common. If the aligned read has a homology less than this parameter, it is discarded. This is useful for filtering erroneous reads that do not align to the target amplicon, for example arising from alternate primer locations.
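As a rough illustration of this filter, the Python sketch below scores an already-aligned read against the reference and applies a threshold. The exact scoring CRISPResso2 uses is not spelled out here, so the denominator (the full alignment length, gaps included) and the function names are assumptions made for the example.

```python
def homology_percent(aligned_read, aligned_ref):
    """Percentage of alignment columns where read and reference share a base.

    Both inputs are gapped alignment strings of equal length ('-' marks a gap).
    Assumption: the score is matches divided by the alignment length.
    """
    if not aligned_read or len(aligned_read) != len(aligned_ref):
        raise ValueError("aligned sequences must be non-empty and equal in length")
    matches = sum(1 for a, b in zip(aligned_read, aligned_ref) if a == b and a != "-")
    return 100.0 * matches / len(aligned_ref)


def keep_read(aligned_read, aligned_ref, min_homology):
    """True if the read meets the minimum homology percentage."""
    return homology_percent(aligned_read, aligned_ref) >= min_homology


# One deleted base out of six columns gives 5/6, about 83.3%.
print(round(homology_percent("ACGT-A", "ACGTTA"), 1))
```

With a threshold of 90, the read in the example would be discarded; with 80, it would be kept.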
Discard Guide Positions that Extend Beyond end of Amplicon
If set, for guides that align to multiple positions, guide positions will be discarded if plotting around those regions would include bp that extend beyond the end of the amplicon.
Quantification Window Center [Base Pairs Relative to 3’ of sgRNA]
Center of the quantification window with respect to the 3’ end of the provided sgRNA sequence. Remember that the sgRNA sequence must be entered without the PAM. For cleaving nucleases, this is the predicted cleavage position.
Defines the size (in bp) of the quantification window extending from the position specified by the Quantification Window Center parameter in relation to the provided guide RNA sequences. Mutations within this number of base pairs from the quantification window center are used in classifying reads as modified or unmodified. For example, setting this to 1 bp extends the window by 1 bp on each side of the cleavage position, for a total length of 2 bp. Disabling this window (setting it to 0) causes all indels in the entire amplicon to be considered.
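The interaction between the window center and the window size can be sketched in Python as follows. This is a simplified model under stated assumptions: the guide lies on the forward strand, positions are 0-based, and the center value of -3 in the example is the commonly cited SpCas9 cut position rather than something this page specifies. The function name is hypothetical.

```python
def quantification_window(amplicon_len, guide_start, guide_len, center, size):
    """Return the 0-based amplicon positions inside the quantification window.

    center: offset relative to the 3' end of the guide (e.g. -3).
    size:   bases included on each side of the cut; 0 means whole amplicon.
    """
    if size == 0:
        return list(range(amplicon_len))
    # Index of the base immediately left of the predicted cut site.
    cut = guide_start + guide_len + center - 1
    lo = max(0, cut - size + 1)
    hi = min(amplicon_len - 1, cut + size)
    return list(range(lo, hi + 1))


# 100 bp amplicon, 20 nt guide starting at index 40, center -3, size 1:
# the window covers the two bases flanking the cut.
print(quantification_window(100, 40, 20, -3, 1))
```

Increasing `size` widens the window symmetrically around the cut, so a size of 5 would cover 10 positions unless the window is clipped by the amplicon ends.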
This defines the size of the window extending from the quantification window center to plot. Nucleotides within the Plot Window Size of the Quantification Window Center for each guide are plotted.
Include this instead of sgRNA when doing analysis on Prime Editing. pegRNA spacer sgRNA sequence used in prime editing. The spacer should not include the PAM sequence. The sequence should be given in the RNA 5’->3’ order, so for Cas9, the PAM would be on the right side of the given sequence.
Extension sequence used in prime editing. The sequence should be given in the RNA 5’->3’ order, such that the sequence starts with the RT template including the edit, followed by the Primer-binding site (PBS).
Quantification window size (in bp) at flap site for measuring modifications anchored at the right side of the extension sequence. Similar to the Quantification Window parameter, the total length of the quantification window will be 2x this parameter.
The nicking sgRNA sequence used in prime editing. The sgRNA should not include the PAM sequence. The sequence should be given in the RNA 5’->3’ order, so for Cas9, the PAM would be on the right side of the sequence.
If given, reads containing any of this scaffold sequence before the Prime Editing Extension Sequence will be classified as ‘Scaffold-incorporated’. The sequence should be given in the 5’->3’ order such that the RT template directly follows this sequence. A common value ends with ‘GGCACCGAGUCGGUGC’.
If given, this sequence will be used as the prime-edited reference sequence. This may be useful if the prime-edited reference sequence has large indels or the algorithm cannot otherwise infer the correct reference sequence.
Enable this when doing analysis on Base Editing. When enabled, plots showing the frequency of substitutions in the quantification window are generated. The target and result bases can also be set to measure the rate of on-target conversion at bases in the quantification window.
Amplicon sequence expected after HDR. The expected HDR amplicon sequence can be provided to quantify the number of reads showing a successful HDR repair.
Subsequences of the amplicon sequence covering one or more coding sequences for frameshift analysis. Sequences of exons within the amplicon sequence can be provided to enable frameshift analysis and splice site analysis by CRISPResso2. Users should provide the subsequences of the reference amplicon sequence that correspond to coding sequences and not the whole exon sequences.
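Because each coding sequence must be an exact subsequence of the amplicon, a quick pre-flight check can catch a pasted whole-exon sequence that runs past the amplicon ends. The Python sketch below is a hypothetical helper for that check and is not part of CRISPResso2.

```python
def locate_coding_sequences(amplicon, coding_seqs):
    """Return (start, end) spans, 0-based and end-exclusive, for each coding sequence.

    Raises ValueError if a coding sequence is not found in the amplicon.
    """
    amplicon = amplicon.upper()
    spans = []
    for cds in coding_seqs:
        start = amplicon.find(cds.upper())
        if start == -1:
            raise ValueError(f"coding sequence not found in amplicon: {cds}")
        spans.append((start, start + len(cds)))
    return spans


print(locate_coding_sequences("AAACCCGGGTTT", ["CCCGGG"]))
```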
If more than one reference amplicon is given, reads that align to multiple reference amplicons will count equally toward each amplicon. Default behavior is to exclude ambiguous alignments.
Use stringent parameters for flash merging. In the case where flash could merge R1 and R2 reads ambiguously, the expected overlap is calculated as 2 * Average Read Length - Amplicon Length.
The flash parameters for minimum and maximum overlap will be set to prefer merged reads with length within 10bp of the expected overlap. These values override the Minimum Length Required for Confident Overlap Between Two Reads and Minimum Overlap Length Expected in ~90% of Read Pairs CRISPResso parameters.
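The arithmetic behind the stringent setting can be worked through with a short Python sketch. The expected-overlap formula comes from the description above; how the 10 bp tolerance is turned into minimum and maximum overlap values is an assumption made for illustration, and the function name is hypothetical.

```python
def stringent_flash_overlaps(avg_read_len, amplicon_len, tolerance=10):
    """Return (expected, min_overlap, max_overlap) in bp.

    expected = 2 * average read length - amplicon length.
    Assumption: min/max overlap are expected -/+ tolerance, floored at 1.
    """
    expected = 2 * avg_read_len - amplicon_len
    if expected <= 0:
        raise ValueError("reads are too short to overlap across this amplicon")
    return expected, max(1, expected - tolerance), expected + tolerance


# Two 150 bp reads over a 250 bp amplicon are expected to overlap by 50 bp.
print(stringent_flash_overlaps(150, 250))
```

In this example the merge would favor overlaps between 40 and 60 bp, which rules out spurious short overlaps between the two reads.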
Minimum % Reads Required To Report an Allele in Table Plot
Minimum % reads required to report an allele in the alleles table plot. This parameter only affects plotting. All alleles will be reported in data files.
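The effect of this threshold can be illustrated with a small Python sketch that filters a table of allele counts before plotting. The allele strings and counts are made up, the function name is hypothetical, and treating the cutoff as inclusive (`>=`) is an assumption.

```python
def alleles_for_plot(allele_counts, min_pct):
    """Return (allele, reads, percent) rows at or above min_pct, most frequent first.

    The percentage is taken over all reads in allele_counts. Rows below the
    cutoff are only hidden from the plot; the input table is left untouched.
    """
    total = sum(allele_counts.values())
    rows = [(seq, n, 100.0 * n / total) for seq, n in allele_counts.items()]
    rows.sort(key=lambda row: row[1], reverse=True)
    return [row for row in rows if row[2] >= min_pct]


counts = {"AAAA": 90, "AA-A": 9, "AATA": 1}  # made-up allele counts
for allele, reads, pct in alleles_for_plot(counts, min_pct=5):
    print(allele, reads, pct)
```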
Show Percentage As Reads Aligned to Assigned Reference
If set, in the allele plots, the percentages will show the percentage as a percent of reads aligned to the assigned reference. Default behavior is to show percentage as a percent of all reads.
If set, alleles with different modifications in the quantification window (but not necessarily in the plotting window (e.g. for another sgRNA)) are plotted on separate lines, even though they may have the same apparent sequence. To force the allele plot and the allele table to be the same, set this parameter. If unset, all alleles with the same sequence will be collapsed into one row.
Output as in CRISPResso1. In particular, if this flag is set, the old output files ‘Mapping_statistics.txt’, and ‘Quantification_of_editing_frequency.txt’ are created, and the new files ‘nucleotide_frequency_table.txt’ and ‘substitution_frequency_table.txt’ and figure 2a and 2b are suppressed, and the files ‘selected_nucleotide_percentage_table.txt’ are not produced when Base Editor Output is enabled.
Aligned reads for processing in bam format. This parameter can be given instead of fastq_r1 to specify that reads are to be taken from this bam file. An output bam is produced that contains an additional field with CRISPResso2 information.
The output of CRISPResso2 consists of a set of informative graphs that allow for the quantification and visualization of the position and type of outcomes within an amplicon sequence. The main output file is CRISPResso2_report.html, a summary report containing all of the output plots and summary statistics that can be viewed in a web browser. You can find more detailed explanations of the outputs in the CRISPResso Manual.