Warning
This site has been replaced by the new QIIME 2 “amplicon distribution” documentation, as of the 2025.4 release of QIIME 2. You can still access the content from the “old docs” here for the QIIME 2 2024.10 and earlier releases, but we recommend that you transition to the new documentation at https://amplicon-docs.qiime2.org. Content on this site is no longer updated and may be out of date.
denoise-paired: Denoise and dereplicate paired-end sequences
Docstring:
Usage: qiime dada2 denoise-paired [OPTIONS]

  This method denoises paired-end sequences, dereplicates them, and filters
  chimeras.

Inputs:
  --i-demultiplexed-seqs ARTIFACT SampleData[PairedEndSequencesWithQuality]
      The paired-end demultiplexed sequences to be denoised.
      [required]

Parameters:
  --p-trunc-len-f INTEGER
      Position at which forward read sequences should be truncated due to
      decrease in quality. This truncates the 3' end of the input sequences,
      which will be the bases that were sequenced in the last cycles. Reads
      that are shorter than this value will be discarded. After this
      parameter is applied there must still be at least a 12 nucleotide
      overlap between the forward and reverse reads. If 0 is provided, no
      truncation or length filtering will be performed.
      [required]
  --p-trunc-len-r INTEGER
      Position at which reverse read sequences should be truncated due to
      decrease in quality. This truncates the 3' end of the input sequences,
      which will be the bases that were sequenced in the last cycles. Reads
      that are shorter than this value will be discarded. After this
      parameter is applied there must still be at least a 12 nucleotide
      overlap between the forward and reverse reads. If 0 is provided, no
      truncation or length filtering will be performed.
      [required]
  --p-trim-left-f INTEGER
      Position at which forward read sequences should be trimmed due to low
      quality. This trims the 5' end of the input sequences, which will be
      the bases that were sequenced in the first cycles.
      [default: 0]
  --p-trim-left-r INTEGER
      Position at which reverse read sequences should be trimmed due to low
      quality. This trims the 5' end of the input sequences, which will be
      the bases that were sequenced in the first cycles.
      [default: 0]
  --p-max-ee-f NUMBER
      Forward reads with number of expected errors higher than this value
      will be discarded.
      [default: 2.0]
  --p-max-ee-r NUMBER
      Reverse reads with number of expected errors higher than this value
      will be discarded.
      [default: 2.0]
  --p-trunc-q INTEGER
      Reads are truncated at the first instance of a quality score less than
      or equal to this value. If the resulting read is then shorter than
      `trunc-len-f` or `trunc-len-r` (depending on the direction of the
      read) it is discarded.
      [default: 2]
  --p-min-overlap INTEGER Range(4, None)
      The minimum length of the overlap required for merging the forward and
      reverse reads.
      [default: 12]
  --p-pooling-method TEXT Choices('independent', 'pseudo')
      The method used to pool samples for denoising.
      "independent": Samples are denoised independently.
      "pseudo": The pseudo-pooling method is used to approximate pooling of
      samples. In short, samples are denoised independently once, ASVs
      detected in at least 2 samples are recorded, and samples are denoised
      independently a second time, but this time with prior knowledge of the
      recorded ASVs and thus higher sensitivity to those ASVs.
      [default: 'independent']
  --p-chimera-method TEXT Choices('consensus', 'none', 'pooled')
      The method used to remove chimeras.
      "none": No chimera removal is performed.
      "pooled": All reads are pooled prior to chimera detection.
      "consensus": Chimeras are detected in samples individually, and
      sequences found chimeric in a sufficient fraction of samples are
      removed.
      [default: 'consensus']
  --p-min-fold-parent-over-abundance NUMBER
      The minimum abundance of potential parents of a sequence being tested
      as chimeric, expressed as a fold-change versus the abundance of the
      sequence being tested. Values should be greater than or equal to 1
      (i.e. parents should be more abundant than the sequence being tested).
      This parameter has no effect if chimera-method is "none".
      [default: 1.0]
  --p-allow-one-off / --p-no-allow-one-off
      If True, bimeras that are one-off from exact are also identified: a
      sequence will be identified as a bimera if it is one mismatch or indel
      away from an exact bimera.
      [default: False]
  --p-n-threads NTHREADS
      The number of threads to use for multithreaded processing. If 0 is
      provided, all available cores will be used.
      [default: 1]
  --p-n-reads-learn INTEGER
      The number of reads to use when training the error model. Smaller
      numbers will result in a shorter run time but a less reliable error
      model.
      [default: 1000000]
  --p-hashed-feature-ids / --p-no-hashed-feature-ids
      If true, the feature ids in the resulting table will be presented as
      hashes of the sequences defining each feature. The hash will always be
      the same for the same sequence so this allows feature tables to be
      merged across runs of this method. You should only merge tables if the
      exact same parameters are used for each run.
      [default: True]
  --p-retain-all-samples / --p-no-retain-all-samples
      If True, all samples input to dada2 will be retained in the output of
      dada2; if False, samples with zero total frequency are removed from
      the table.
      [default: True]

Outputs:
  --o-table ARTIFACT FeatureTable[Frequency]
      The resulting feature table.
      [required]
  --o-representative-sequences ARTIFACT FeatureData[Sequence]
      The resulting feature sequences. Each feature in the feature table
      will be represented by exactly one sequence, and these sequences will
      be the joined paired-end sequences.
      [required]
  --o-denoising-stats ARTIFACT SampleData[DADA2Stats]
      [required]

Miscellaneous:
  --output-dir PATH
      Output unspecified results to a directory
  --verbose / --quiet
      Display verbose output to stdout and/or stderr during execution of
      this action. Or silence output if execution is successful (silence is
      golden).
  --example-data PATH
      Write example data and exit.
  --citations
      Show citations and exit.
  --use-cache DIRECTORY
      Specify the cache to be used for the intermediate work of this action.
      If not provided, the default cache under $TMP/qiime2/ will be used.
      IMPORTANT FOR HPC USERS: If you are on an HPC system and are using
      parallel execution it is important to set this to a location that is
      globally accessible to all nodes in the cluster.
  --help
      Show this message and exit.

Examples:
  # ### example: denoise paired
  qiime dada2 denoise-paired \
    --i-demultiplexed-seqs demux-paired.qza \
    --p-trunc-len-f 150 \
    --p-trunc-len-r 140 \
    --o-representative-sequences representative-sequences.qza \
    --o-table table.qza \
    --o-denoising-stats denoising-stats.qza
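The truncation parameters interact with the overlap requirement: after truncation, the forward and reverse reads must still share enough bases to be merged. The following is a minimal sketch of that arithmetic, not part of QIIME 2 or DADA2. It assumes both reads start at opposite ends of an amplicon of known, fixed length, and the function names and the 253-base amplicon length are hypothetical values chosen for illustration.

```python
def merged_overlap(fwd_len: int, rev_len: int, amplicon_len: int) -> int:
    """Estimate how many bases the forward and reverse reads share.

    Assumption (not from the QIIME 2 docs): the forward read starts at one
    end of the amplicon and the reverse read at the other, so the shared
    region is whatever the two read lengths cover beyond the amplicon length.
    A negative result means the reads do not reach each other.
    """
    if min(fwd_len, rev_len, amplicon_len) <= 0:
        raise ValueError("all lengths must be positive")
    return fwd_len + rev_len - amplicon_len


def can_merge(fwd_len: int, rev_len: int, amplicon_len: int,
              min_overlap: int = 12) -> bool:
    """Check the estimated overlap against a minimum (12 mirrors the default)."""
    return merged_overlap(fwd_len, rev_len, amplicon_len) >= min_overlap


if __name__ == "__main__":
    # Truncation lengths from the example above; 253 is a hypothetical
    # amplicon length used only to make the arithmetic concrete.
    print(merged_overlap(150, 140, 253))
    print(can_merge(150, 140, 253))
```

Real amplicons vary in length, so a check like this is best run against the longest amplicon you expect rather than the average.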
Import:
from qiime2.plugins.dada2.methods import denoise_paired
Docstring:
Denoise and dereplicate paired-end sequences

This method denoises paired-end sequences, dereplicates them, and filters
chimeras.

Parameters
----------
demultiplexed_seqs : SampleData[PairedEndSequencesWithQuality]
    The paired-end demultiplexed sequences to be denoised.
trunc_len_f : Int
    Position at which forward read sequences should be truncated due to
    decrease in quality. This truncates the 3' end of the input sequences,
    which will be the bases that were sequenced in the last cycles. Reads
    that are shorter than this value will be discarded. After this parameter
    is applied there must still be at least a 12 nucleotide overlap between
    the forward and reverse reads. If 0 is provided, no truncation or length
    filtering will be performed.
trunc_len_r : Int
    Position at which reverse read sequences should be truncated due to
    decrease in quality. This truncates the 3' end of the input sequences,
    which will be the bases that were sequenced in the last cycles. Reads
    that are shorter than this value will be discarded. After this parameter
    is applied there must still be at least a 12 nucleotide overlap between
    the forward and reverse reads. If 0 is provided, no truncation or length
    filtering will be performed.
trim_left_f : Int, optional
    Position at which forward read sequences should be trimmed due to low
    quality. This trims the 5' end of the input sequences, which will be the
    bases that were sequenced in the first cycles.
trim_left_r : Int, optional
    Position at which reverse read sequences should be trimmed due to low
    quality. This trims the 5' end of the input sequences, which will be the
    bases that were sequenced in the first cycles.
max_ee_f : Float, optional
    Forward reads with number of expected errors higher than this value will
    be discarded.
max_ee_r : Float, optional
    Reverse reads with number of expected errors higher than this value will
    be discarded.
trunc_q : Int, optional
    Reads are truncated at the first instance of a quality score less than
    or equal to this value. If the resulting read is then shorter than
    `trunc_len_f` or `trunc_len_r` (depending on the direction of the read)
    it is discarded.
min_overlap : Int % Range(4, None), optional
    The minimum length of the overlap required for merging the forward and
    reverse reads.
pooling_method : Str % Choices('independent', 'pseudo'), optional
    The method used to pool samples for denoising.
    "independent": Samples are denoised independently.
    "pseudo": The pseudo-pooling method is used to approximate pooling of
    samples. In short, samples are denoised independently once, ASVs
    detected in at least 2 samples are recorded, and samples are denoised
    independently a second time, but this time with prior knowledge of the
    recorded ASVs and thus higher sensitivity to those ASVs.
chimera_method : Str % Choices('consensus', 'none', 'pooled'), optional
    The method used to remove chimeras.
    "none": No chimera removal is performed.
    "pooled": All reads are pooled prior to chimera detection.
    "consensus": Chimeras are detected in samples individually, and
    sequences found chimeric in a sufficient fraction of samples are
    removed.
min_fold_parent_over_abundance : Float, optional
    The minimum abundance of potential parents of a sequence being tested as
    chimeric, expressed as a fold-change versus the abundance of the
    sequence being tested. Values should be greater than or equal to 1 (i.e.
    parents should be more abundant than the sequence being tested). This
    parameter has no effect if chimera_method is "none".
allow_one_off : Bool, optional
    If True, bimeras that are one-off from exact are also identified: a
    sequence will be identified as a bimera if it is one mismatch or indel
    away from an exact bimera.
n_threads : Threads, optional
    The number of threads to use for multithreaded processing. If 0 is
    provided, all available cores will be used.
n_reads_learn : Int, optional
    The number of reads to use when training the error model. Smaller
    numbers will result in a shorter run time but a less reliable error
    model.
hashed_feature_ids : Bool, optional
    If true, the feature ids in the resulting table will be presented as
    hashes of the sequences defining each feature. The hash will always be
    the same for the same sequence so this allows feature tables to be
    merged across runs of this method. You should only merge tables if the
    exact same parameters are used for each run.
retain_all_samples : Bool, optional
    If True, all samples input to dada2 will be retained in the output of
    dada2; if False, samples with zero total frequency are removed from the
    table.

Returns
-------
table : FeatureTable[Frequency]
    The resulting feature table.
representative_sequences : FeatureData[Sequence]
    The resulting feature sequences. Each feature in the feature table will
    be represented by exactly one sequence, and these sequences will be the
    joined paired-end sequences.
denoising_stats : SampleData[DADA2Stats]
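The `max_ee_f` and `max_ee_r` thresholds are stated in terms of "expected errors". As a rough illustration, the sketch below computes that quantity from Phred quality scores using the standard definition, where a score Q corresponds to an error probability of 10^(-Q/10) and the expected number of errors is the sum over all bases. The function names are hypothetical and this is plain Python, not the code QIIME 2 or DADA2 actually runs.

```python
def expected_errors(phred_scores):
    """Sum the per-base error probabilities implied by Phred scores.

    A Phred score Q corresponds to an error probability of 10 ** (-Q / 10).
    """
    return sum(10 ** (-q / 10) for q in phred_scores)


def passes_max_ee(phred_scores, max_ee=2.0):
    """Keep a read unless its expected errors are higher than max_ee.

    The default of 2.0 mirrors the documented default for max_ee_f/max_ee_r.
    """
    return expected_errors(phred_scores) <= max_ee


if __name__ == "__main__":
    high_quality = [30] * 150   # hypothetical read: 150 bases at Q30
    low_quality = [10] * 30     # hypothetical read: 30 bases at Q10
    print(passes_max_ee(high_quality))
    print(passes_max_ee(low_quality))
```

Because low-quality bases dominate the sum, truncating a read before its quality drops can bring it under the threshold, which is one reason the truncation and expected-error parameters are usually tuned together.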