Docstring:
Usage: qiime dada2 denoise-paired [OPTIONS]
This method denoises paired-end sequences, dereplicates them, and filters
chimeras.
Inputs:
--i-demultiplexed-seqs ARTIFACT SampleData[PairedEndSequencesWithQuality]
The paired-end demultiplexed sequences to be
denoised. [required]
Parameters:
--p-trunc-len-f INTEGER
Position at which forward read sequences should be
truncated due to decrease in quality. This truncates
the 3' end of the input sequences, which will
be the bases that were sequenced in the last cycles.
Reads that are shorter than this value will be
discarded. After this parameter is applied there must
still be at least a 12 nucleotide overlap between the
forward and reverse reads. If 0 is provided, no
truncation or length filtering will be performed.
[required]
--p-trunc-len-r INTEGER
Position at which reverse read sequences should be
truncated due to decrease in quality. This truncates
the 3' end of the input sequences, which will
be the bases that were sequenced in the last cycles.
Reads that are shorter than this value will be
discarded. After this parameter is applied there must
still be at least a 12 nucleotide overlap between the
forward and reverse reads. If 0 is provided, no
truncation or length filtering will be performed.
[required]
--p-trim-left-f INTEGER
Position at which forward read sequences should be
trimmed due to low quality. This trims the 5' end of
the input sequences, which will be the bases that
were sequenced in the first cycles. [default: 0]
--p-trim-left-r INTEGER
Position at which reverse read sequences should be
trimmed due to low quality. This trims the 5' end of
the input sequences, which will be the bases that
were sequenced in the first cycles. [default: 0]
--p-max-ee-f NUMBER Forward reads with number of expected errors higher
than this value will be discarded. [default: 2.0]
--p-max-ee-r NUMBER Reverse reads with number of expected errors higher
than this value will be discarded. [default: 2.0]
--p-trunc-q INTEGER Reads are truncated at the first instance of a
quality score less than or equal to this value. If
the resulting read is then shorter than `trunc-len-f`
or `trunc-len-r` (depending on the direction of the
read) it is discarded. [default: 2]
--p-min-overlap INTEGER
Range(4, None) The minimum length of the overlap required for
merging the forward and reverse reads. [default: 12]
--p-pooling-method TEXT Choices('independent', 'pseudo')
The method used to pool samples for denoising.
"independent": Samples are denoised independently.
"pseudo": The pseudo-pooling method is used to
approximate pooling of samples. In short, samples are
denoised independently once, ASVs detected in at
least 2 samples are recorded, and samples are
denoised independently a second time, but this time
with prior knowledge of the recorded ASVs and thus
higher sensitivity to those ASVs.
[default: 'independent']
--p-chimera-method TEXT Choices('consensus', 'none', 'pooled')
The method used to remove chimeras. "none": No
chimera removal is performed. "pooled": All reads are
pooled prior to chimera detection. "consensus":
Chimeras are detected in samples individually, and
sequences found chimeric in a sufficient fraction of
samples are removed. [default: 'consensus']
--p-min-fold-parent-over-abundance NUMBER
The minimum abundance of potential parents of a
sequence being tested as chimeric, expressed as a
fold-change versus the abundance of the sequence
being tested. Values should be greater than or equal
to 1 (i.e. parents should be more abundant than the
sequence being tested). This parameter has no effect
if chimera-method is "none". [default: 1.0]
--p-allow-one-off / --p-no-allow-one-off
If True, bimeras that are one-off from exact are also
identified: a sequence is called a bimera if it is
one mismatch or indel away from an exact bimera.
[default: False]
--p-n-threads INTEGER The number of threads to use for multithreaded
processing. If 0 is provided, all available cores
will be used. [default: 1]
--p-n-reads-learn INTEGER
The number of reads to use when training the error
model. Smaller numbers will result in a shorter run
time but a less reliable error model.
[default: 1000000]
--p-hashed-feature-ids / --p-no-hashed-feature-ids
If true, the feature ids in the resulting table will
be presented as hashes of the sequences defining each
feature. The hash will always be the same for the
same sequence so this allows feature tables to be
merged across runs of this method. You should only
merge tables if the exact same parameters are used
for each run. [default: True]
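The overlap requirement attached to `--p-trunc-len-f`, `--p-trunc-len-r`, and `--p-min-overlap` can be checked with simple arithmetic before running the command. The following is a minimal sketch, not part of QIIME 2: it assumes the forward and reverse reads start at opposite ends of an amplicon of known length, and `amplicon_len` and both function names are hypothetical. It does not apply when a truncation length of 0 is used, since reads then keep their original lengths.

```python
def merge_overlap(amplicon_len: int, trunc_len_f: int, trunc_len_r: int) -> int:
    """Estimate the overlap (in nucleotides) left after truncation.

    Assumption: the two reads start at opposite ends of an amplicon that is
    `amplicon_len` nucleotides long (hypothetical input, not a QIIME 2 option).
    """
    return trunc_len_f + trunc_len_r - amplicon_len


def truncation_is_mergeable(amplicon_len: int, trunc_len_f: int,
                            trunc_len_r: int, min_overlap: int = 12) -> bool:
    """True if the estimated overlap meets the minimum required for merging."""
    return merge_overlap(amplicon_len, trunc_len_f, trunc_len_r) >= min_overlap


# Hypothetical 253-nt amplicon with the truncation lengths used in the example.
print(merge_overlap(253, 150, 140))
print(truncation_is_mergeable(253, 150, 140))
print(truncation_is_mergeable(253, 130, 130))
```

Real amplicons vary in length, so it is safer to run the check against the longest amplicon expected in the data.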
Outputs:
--o-table ARTIFACT FeatureTable[Frequency]
The resulting feature table. [required]
--o-representative-sequences ARTIFACT FeatureData[Sequence]
The resulting feature sequences. Each feature in the
feature table will be represented by exactly one
sequence, and these sequences will be the joined
paired-end sequences. [required]
--o-denoising-stats ARTIFACT SampleData[DADA2Stats]
[required]
Miscellaneous:
--output-dir PATH Output unspecified results to a directory
--verbose / --quiet Display verbose output to stdout and/or stderr
during execution of this action. Or silence output if
execution is successful (silence is golden).
--example-data PATH Write example data and exit.
--citations Show citations and exit.
--help Show this message and exit.
Examples:
# ### example: denoise paired
qiime dada2 denoise-paired \
--i-demultiplexed-seqs demux-paired.qza \
--p-trunc-len-f 150 \
--p-trunc-len-r 140 \
--o-representative-sequences representative-sequences.qza \
--o-table table.qza \
--o-denoising-stats denoising-stats.qza
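The same command can be assembled from a script, which helps when the truncation lengths are chosen per run. This is a sketch only: `build_denoise_paired_cmd` is a hypothetical helper, and it uses only flags listed in the help text above. Actually running the command requires an environment with QIIME 2 and the dada2 plugin installed.

```python
import shlex


def build_denoise_paired_cmd(demux, trunc_len_f, trunc_len_r,
                             table="table.qza",
                             rep_seqs="representative-sequences.qza",
                             stats="denoising-stats.qza",
                             n_threads=1):
    """Return the denoise-paired command as an argument list (hypothetical helper)."""
    return [
        "qiime", "dada2", "denoise-paired",
        "--i-demultiplexed-seqs", demux,
        "--p-trunc-len-f", str(trunc_len_f),
        "--p-trunc-len-r", str(trunc_len_r),
        "--p-n-threads", str(n_threads),
        "--o-table", table,
        "--o-representative-sequences", rep_seqs,
        "--o-denoising-stats", stats,
    ]


cmd = build_denoise_paired_cmd("demux-paired.qza", 150, 140)
print(shlex.join(cmd))  # shell-quoted form, for logging

# To execute it inside a QIIME 2 environment:
# import subprocess
# subprocess.run(cmd, check=True)
```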
Import:
from qiime2.plugins.dada2.methods import denoise_paired
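A call through the Python API might look like the sketch below. It mirrors the command-line example (same input file name and truncation lengths) and assumes a QIIME 2 environment with the dada2 plugin installed. `run_denoise_paired` is a hypothetical wrapper, and the imports are placed inside it only so that the snippet can be loaded on a machine without QIIME 2.

```python
def run_denoise_paired(demux_path, trunc_len_f=150, trunc_len_r=140):
    """Load a demultiplexed artifact, denoise it, and save the three outputs."""
    # Imported lazily so this snippet can be loaded without QIIME 2 installed.
    from qiime2 import Artifact
    from qiime2.plugins.dada2.methods import denoise_paired

    demux = Artifact.load(demux_path)
    results = denoise_paired(
        demultiplexed_seqs=demux,
        trunc_len_f=trunc_len_f,
        trunc_len_r=trunc_len_r,
    )
    # The result fields are named after the outputs documented for this method.
    results.table.save("table.qza")
    results.representative_sequences.save("representative-sequences.qza")
    results.denoising_stats.save("denoising-stats.qza")
    return results


# Usage, inside a QIIME 2 environment:
# results = run_denoise_paired("demux-paired.qza")
```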
Docstring:
Denoise and dereplicate paired-end sequences
This method denoises paired-end sequences, dereplicates them, and filters
chimeras.
Parameters
----------
demultiplexed_seqs : SampleData[PairedEndSequencesWithQuality]
The paired-end demultiplexed sequences to be denoised.
trunc_len_f : Int
Position at which forward read sequences should be truncated due to
decrease in quality. This truncates the 3' end of the input
sequences, which will be the bases that were sequenced in the last
cycles. Reads that are shorter than this value will be discarded. After
this parameter is applied there must still be at least a 12 nucleotide
overlap between the forward and reverse reads. If 0 is provided, no
truncation or length filtering will be performed.
trunc_len_r : Int
Position at which reverse read sequences should be truncated due to
decrease in quality. This truncates the 3' end of the input
sequences, which will be the bases that were sequenced in the last
cycles. Reads that are shorter than this value will be discarded. After
this parameter is applied there must still be at least a 12 nucleotide
overlap between the forward and reverse reads. If 0 is provided, no
truncation or length filtering will be performed.
trim_left_f : Int, optional
Position at which forward read sequences should be trimmed due to low
quality. This trims the 5' end of the input sequences, which will be
the bases that were sequenced in the first cycles.
trim_left_r : Int, optional
Position at which reverse read sequences should be trimmed due to low
quality. This trims the 5' end of the input sequences, which will be
the bases that were sequenced in the first cycles.
max_ee_f : Float, optional
Forward reads with number of expected errors higher than this value
will be discarded.
max_ee_r : Float, optional
Reverse reads with number of expected errors higher than this value
will be discarded.
trunc_q : Int, optional
Reads are truncated at the first instance of a quality score less than
or equal to this value. If the resulting read is then shorter than
`trunc_len_f` or `trunc_len_r` (depending on the direction of the read)
it is discarded.
min_overlap : Int % Range(4, None), optional
The minimum length of the overlap required for merging the forward and
reverse reads.
pooling_method : Str % Choices('independent', 'pseudo'), optional
The method used to pool samples for denoising. "independent": Samples
are denoised independently. "pseudo": The pseudo-pooling method is used
to approximate pooling of samples. In short, samples are denoised
independently once, ASVs detected in at least 2 samples are recorded,
and samples are denoised independently a second time, but this time
with prior knowledge of the recorded ASVs and thus higher sensitivity
to those ASVs.
chimera_method : Str % Choices('consensus', 'none', 'pooled'), optional
The method used to remove chimeras. "none": No chimera removal is
performed. "pooled": All reads are pooled prior to chimera detection.
"consensus": Chimeras are detected in samples individually, and
sequences found chimeric in a sufficient fraction of samples are
removed.
min_fold_parent_over_abundance : Float, optional
The minimum abundance of potential parents of a sequence being tested
as chimeric, expressed as a fold-change versus the abundance of the
sequence being tested. Values should be greater than or equal to 1
(i.e. parents should be more abundant than the sequence being tested).
This parameter has no effect if chimera_method is "none".
allow_one_off : Bool, optional
If True, bimeras that are one-off from exact are also identified: a
sequence is called a bimera if it is one mismatch or indel away from an
exact bimera.
n_threads : Int, optional
The number of threads to use for multithreaded processing. If 0 is
provided, all available cores will be used.
n_reads_learn : Int, optional
The number of reads to use when training the error model. Smaller
numbers will result in a shorter run time but a less reliable error
model.
hashed_feature_ids : Bool, optional
If true, the feature ids in the resulting table will be presented as
hashes of the sequences defining each feature. The hash will always be
the same for the same sequence so this allows feature tables to be
merged across runs of this method. You should only merge tables if the
exact same parameters are used for each run.
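The per-read quality filters (`trunc_q`, `trunc_len_f`/`trunc_len_r`, and `max_ee_f`/`max_ee_r`) can be illustrated on a list of Phred scores. This is a simplified sketch rather than the DADA2 implementation: it ignores `trim_left_*`, the function names are hypothetical, and the order of the steps is an assumption. The expected-error count uses the standard Phred relation, where a base with score Q has error probability 10^(-Q/10).

```python
def truncate_at_quality(quals, trunc_q=2):
    """Number of leading bases kept: all bases before the first score <= trunc_q."""
    for i, q in enumerate(quals):
        if q <= trunc_q:
            return i
    return len(quals)


def expected_errors(quals):
    """Sum of per-base error probabilities implied by the Phred scores."""
    return sum(10 ** (-q / 10) for q in quals)


def passes_filters(quals, trunc_len, max_ee=2.0, trunc_q=2):
    """Apply trunc_q, then trunc_len, then max_ee to one read's quality scores."""
    kept = quals[:truncate_at_quality(quals, trunc_q)]
    if trunc_len > 0:
        if len(kept) < trunc_len:
            return False  # too short after quality truncation: discarded
        kept = kept[:trunc_len]
    return expected_errors(kept) <= max_ee


high_quality = [30] * 150                   # about 0.15 expected errors in total
early_drop = [30] * 100 + [2] + [30] * 49   # truncated to 100 bases by trunc_q
print(passes_filters(high_quality, trunc_len=150))
print(passes_filters(early_drop, trunc_len=150))
```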
Returns
-------
table : FeatureTable[Frequency]
The resulting feature table.
representative_sequences : FeatureData[Sequence]
The resulting feature sequences. Each feature in the feature table will
be represented by exactly one sequence, and these sequences will be the
joined paired-end sequences.
denoising_stats : SampleData[DADA2Stats]
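Because hashed feature ids depend only on the sequence, tables from separate runs can be combined by id. The sketch below illustrates the idea with plain dictionaries instead of QIIME 2 artifacts. It assumes the hash is the MD5 hex digest of the sequence string, which the docstring does not state, and `hashed_feature_id` is a hypothetical helper.

```python
import hashlib
from collections import Counter


def hashed_feature_id(sequence: str) -> str:
    """Deterministic id for a sequence (assumption: MD5 hex digest)."""
    return hashlib.md5(sequence.encode("utf-8")).hexdigest()


# Toy per-run counts keyed by sequence (made-up data for illustration).
run_a = {"ACGTACGT": 10, "TTGACCAA": 3}
run_b = {"ACGTACGT": 5, "GGCATTCA": 7}

# Re-key each run by hashed id, then sum counts for ids shared across runs.
table_a = Counter({hashed_feature_id(s): n for s, n in run_a.items()})
table_b = Counter({hashed_feature_id(s): n for s, n in run_b.items()})
merged = table_a + table_b

print(len(merged))
print(merged[hashed_feature_id("ACGTACGT")])
```

The same sequence receives the same id in both runs, so its counts add up, while sequences seen in only one run keep their own rows.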