Warning
This site has been replaced by the new QIIME 2 “amplicon distribution” documentation, as of the 2025.4 release of QIIME 2. You can still access the content from the “old docs” here for the QIIME 2 2024.10 and earlier releases, but we recommend that you transition to the new documentation at https://amplicon-docs.qiime2.org. Content on this site is no longer updated and may be out of date.
extract-reads: Extract reads from reference sequences.
Docstring:
Usage: qiime feature-classifier extract-reads [OPTIONS]
Extract simulated amplicon reads from a reference database. Performs in-
silico PCR to extract simulated amplicons from reference sequences that
match the input primer sequences (within the mismatch threshold specified by
`identity`). Both primer sequences must be in the 5' -> 3' orientation.
Sequences that fail to match both primers will be excluded. Reads are
extracted, trimmed, and filtered in the following order: 1. reads are
extracted in specified orientation; 2. primers are removed; 3. reads longer
than `max_length` are removed; 4. reads are trimmed with `trim_right`; 5.
reads are truncated to `trunc_len`; 6. reads are trimmed with `trim_left`;
7. reads shorter than `min_length` are removed.
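The order of steps 3 to 7 above matters because `max_length` is checked before any trimming while `min_length` is checked after it. The following is a minimal pure-Python sketch of that order, not the plugin's actual implementation; the function name `trim_and_filter` is hypothetical, and the input is assumed to be an amplicon whose primers have already been removed (steps 1 and 2).

```python
from typing import Optional


def trim_and_filter(read: str, trim_right: int = 0, trunc_len: int = 0,
                    trim_left: int = 0, min_length: int = 50,
                    max_length: int = 0) -> Optional[str]:
    """Apply steps 3-7 to one primer-free amplicon; return None if discarded.

    Hypothetical helper for illustration only; defaults mirror the
    documented parameter defaults.
    """
    # Step 3: max_length is checked before any trimming (0 disables it).
    if max_length > 0 and len(read) > max_length:
        return None
    # Step 4: remove trim_right nucleotides from the 3' end.
    if trim_right > 0:
        read = read[:-trim_right]
    # Step 5: truncate to trunc_len nucleotides.
    if trunc_len > 0:
        read = read[:trunc_len]
    # Step 6: remove trim_left nucleotides from the 5' end.
    if trim_left > 0:
        read = read[trim_left:]
    # Step 7: min_length is checked last (0 disables it).
    if min_length > 0 and len(read) < min_length:
        return None
    return read


amplicon = "ACGT" * 30  # a 120 nt stand-in for an extracted amplicon

# 120 -> 110 (trim_right) -> 100 (trunc_len) -> 80 (trim_left); 80 >= 50, kept.
print(len(trim_and_filter(amplicon, trim_right=10, trunc_len=100, trim_left=20)))

# Discarded at step 3: 120 > max_length, even though truncation would shorten it.
print(trim_and_filter(amplicon, max_length=100, trunc_len=80))

# Discarded at step 7: truncation to 40 nt falls below the default min_length of 50.
print(trim_and_filter(amplicon, trunc_len=40))
```

In this sketch, a `min_length` larger than `trunc_len - (trim_left + trim_right)` discards every read, which is why the trimming parameters and `min_length` should be chosen together.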
Inputs:
--i-sequences ARTIFACT FeatureData[Sequence]
[required]
Parameters:
--p-f-primer TEXT forward primer sequence (5' -> 3'). [required]
--p-r-primer TEXT reverse primer sequence (5' -> 3'). Do not use
reverse-complemented primer sequence. [required]
--p-trim-right INTEGER trim-right nucleotides are removed from the 3' end
if trim-right is positive. Applied before trunc-len
and trim-left. [default: 0]
--p-trunc-len INTEGER read is cut to trunc-len if trunc-len is positive.
Applied after trim-right but before trim-left.
[default: 0]
--p-trim-left INTEGER trim-left nucleotides are removed from the 5' end
if trim-left is positive. Applied after trim-right
and trunc-len. [default: 0]
--p-identity NUMBER minimum combined primer match identity threshold.
[default: 0.8]
--p-min-length INTEGER Range(0, None)
Minimum amplicon length. Shorter amplicons are
discarded. Applied after trimming and truncation, so
be aware that trimming may impact sequence
retention. Set to zero to disable min length
filtering. [default: 50]
--p-max-length INTEGER Range(0, None)
Maximum amplicon length. Longer amplicons are
discarded. Applied before trimming and truncation,
so plan accordingly. Set to zero (default) to
disable max length filtering. [default: 0]
--p-n-jobs INTEGER Range(1, None)
Number of separate processes to run. [default: 1]
--p-batch-size VALUE Int % Range(1, None) | Str % Choices('auto')
Number of sequences to process in a batch. The
`auto` option is calculated from the number of
sequences and number of jobs specified.
[default: 'auto']
--p-read-orientation TEXT Choices('both', 'forward', 'reverse')
Orientation of primers relative to the sequences:
"forward" searches for primer hits in the forward
direction, "reverse" searches reverse-complement,
and "both" searches both directions.
[default: 'both']
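The primer orientation rule and the `read-orientation` choices can be illustrated with a deliberately simplified sketch. It assumes exact matching on unambiguous A/C/G/T bases only, so it ignores the `identity` threshold and degenerate (IUPAC) primer codes that the real action supports. All function names here are hypothetical, and the orientation in which the real action reports reverse-strand hits is not something this page specifies.

```python
from typing import Optional

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Reverse-complement an unambiguous DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def _extract_on_strand(seq: str, f_primer: str, r_primer: str) -> Optional[str]:
    """Exact-match search on one strand; returns the primer-free insert."""
    start = seq.find(f_primer)
    if start == -1:
        return None
    insert_start = start + len(f_primer)
    # The reverse primer is supplied 5' -> 3' and anneals to the opposite
    # strand, so on this strand it shows up as its reverse complement.
    end = seq.find(reverse_complement(r_primer), insert_start)
    if end == -1:
        return None
    return seq[insert_start:end]


def extract_amplicon(seq: str, f_primer: str, r_primer: str,
                     read_orientation: str = "both") -> Optional[str]:
    """Toy in-silico PCR; returns None when both primers are not found."""
    if read_orientation not in ("both", "forward", "reverse"):
        raise ValueError(f"unknown read_orientation: {read_orientation!r}")
    if read_orientation in ("both", "forward"):
        hit = _extract_on_strand(seq, f_primer, r_primer)
        if hit is not None:
            return hit
    if read_orientation in ("both", "reverse"):
        # Search the reverse complement of the reference sequence.
        return _extract_on_strand(reverse_complement(seq), f_primer, r_primer)
    return None


# Made-up sequences for illustration only.
F_PRIMER = "ACGTAC"
R_PRIMER = "CCTTGA"  # given 5' -> 3', NOT reverse-complemented
REFERENCE = "GGACGTACTTAGGCATTCAAGGCC"
FLIPPED = reverse_complement(REFERENCE)  # same record stored on the other strand

print(extract_amplicon(REFERENCE, F_PRIMER, R_PRIMER, "forward"))  # TTAGGCAT
print(extract_amplicon(FLIPPED, F_PRIMER, R_PRIMER, "forward"))    # None
print(extract_amplicon(FLIPPED, F_PRIMER, R_PRIMER, "both"))       # TTAGGCAT
```

The second call shows why `both` is a convenient default when a reference database may contain sequences in mixed orientations: a forward-only search misses records stored on the opposite strand.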
Outputs:
--o-reads ARTIFACT FeatureData[Sequence]
[required]
Miscellaneous:
--output-dir PATH Output unspecified results to a directory
--verbose / --quiet Display verbose output to stdout and/or stderr
during execution of this action. Or silence output
if execution is successful (silence is golden).
--example-data PATH Write example data and exit.
--citations Show citations and exit.
--use-cache DIRECTORY Specify the cache to be used for the intermediate
work of this action. If not provided, the default
cache under $TMP/qiime2/ will be used.
IMPORTANT FOR HPC USERS: If you are on an HPC system
and are using parallel execution it is important to
set this to a location that is globally accessible
to all nodes in the cluster.
--help Show this message and exit.
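As a usage illustration, the options listed above can be assembled into a command line from Python. This is a sketch under stated assumptions: the helper `build_extract_reads_command` and the file names `ref-seqs.qza` and `ref-seqs-extracted.qza` are hypothetical, and the primer pair and length settings are illustrative values that do not come from this page. Only flags shown in the help text above are used.

```python
import shlex

# Parameter names taken from the help text above (underscores become dashes).
_KNOWN_PARAMS = {
    "trim_right", "trunc_len", "trim_left", "identity", "min_length",
    "max_length", "n_jobs", "batch_size", "read_orientation",
}


def build_extract_reads_command(sequences, f_primer, r_primer, reads, **params):
    """Return the argument list for `qiime feature-classifier extract-reads`."""
    unknown = set(params) - _KNOWN_PARAMS
    if unknown:
        raise ValueError(f"unknown parameter(s): {sorted(unknown)}")
    cmd = [
        "qiime", "feature-classifier", "extract-reads",
        "--i-sequences", sequences,
        "--p-f-primer", f_primer,
        "--p-r-primer", r_primer,
    ]
    for name, value in params.items():
        cmd += [f"--p-{name.replace('_', '-')}", str(value)]
    cmd += ["--o-reads", reads]
    return cmd


cmd = build_extract_reads_command(
    "ref-seqs.qza",            # hypothetical input artifact
    "GTGCCAGCMGCCGCGGTAA",     # illustrative forward primer (5' -> 3')
    "GGACTACHVGGGTWTCTAAT",    # illustrative reverse primer (5' -> 3')
    "ref-seqs-extracted.qza",  # hypothetical output artifact
    trunc_len=120, min_length=100, max_length=400,
)
print(shlex.join(cmd))
```

Inside an activated QIIME 2 environment, the resulting list could be passed to `subprocess.run(cmd, check=True)`; the printed string can also be pasted into a shell.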
Import:
from qiime2.plugins.feature_classifier.methods import extract_reads
Docstring:
Extract reads from reference sequences.
Extract simulated amplicon reads from a reference database. Performs in-
silico PCR to extract simulated amplicons from reference sequences that
match the input primer sequences (within the mismatch threshold specified
by `identity`). Both primer sequences must be in the 5' -> 3' orientation.
Sequences that fail to match both primers will be excluded. Reads are
extracted, trimmed, and filtered in the following order: 1. reads are
extracted in specified orientation; 2. primers are removed; 3. reads longer
than `max_length` are removed; 4. reads are trimmed with `trim_right`; 5.
reads are truncated to `trunc_len`; 6. reads are trimmed with `trim_left`;
7. reads shorter than `min_length` are removed.
Parameters
----------
sequences : FeatureData[Sequence]
f_primer : Str
forward primer sequence (5' -> 3').
r_primer : Str
reverse primer sequence (5' -> 3'). Do not use reverse-complemented
primer sequence.
trim_right : Int, optional
trim_right nucleotides are removed from the 3' end if trim_right is
positive. Applied before trunc_len and trim_left.
trunc_len : Int, optional
read is cut to trunc_len if trunc_len is positive. Applied after
trim_right but before trim_left.
trim_left : Int, optional
trim_left nucleotides are removed from the 5' end if trim_left is
positive. Applied after trim_right and trunc_len.
identity : Float, optional
minimum combined primer match identity threshold.
min_length : Int % Range(0, None), optional
Minimum amplicon length. Shorter amplicons are discarded. Applied after
trimming and truncation, so be aware that trimming may impact sequence
retention. Set to zero to disable min length filtering.
max_length : Int % Range(0, None), optional
Maximum amplicon length. Longer amplicons are discarded. Applied before
trimming and truncation, so plan accordingly. Set to zero (default) to
disable max length filtering.
n_jobs : Int % Range(1, None), optional
Number of separate processes to run.
batch_size : Int % Range(1, None) | Str % Choices('auto'), optional
Number of sequences to process in a batch. The `auto` option is
calculated from the number of sequences and number of jobs specified.
read_orientation : Str % Choices('both', 'forward', 'reverse'), optional
Orientation of primers relative to the sequences: "forward" searches
for primer hits in the forward direction, "reverse" searches reverse-
complement, and "both" searches both directions.
Returns
-------
reads : FeatureData[Sequence]
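A short sketch of calling this method through the Python API follows. It assumes a working QIIME 2 installation and an existing `FeatureData[Sequence]` artifact; the wrapper name `extract_reference_reads`, the file paths, and the primer and length values are illustrative assumptions rather than recommendations from this page.

```python
def extract_reference_reads(reference_path, output_path):
    """Load reference sequences, extract simulated amplicons, save the result.

    Hypothetical wrapper for illustration. The QIIME 2 imports live inside
    the function so this module can be imported where QIIME 2 is absent.
    """
    from qiime2 import Artifact
    from qiime2.plugins.feature_classifier.methods import extract_reads

    # Load a FeatureData[Sequence] artifact from disk.
    sequences = Artifact.load(reference_path)

    results = extract_reads(
        sequences=sequences,
        f_primer="GTGCCAGCMGCCGCGGTAA",   # illustrative, 5' -> 3'
        r_primer="GGACTACHVGGGTWTCTAAT",  # illustrative, 5' -> 3', not reverse-complemented
        trunc_len=120,
        min_length=100,
        max_length=400,
    )

    # The method returns a single output named `reads`.
    results.reads.save(output_path)
    return results.reads


# Example call (requires QIIME 2 and a real input file):
# reads = extract_reference_reads("ref-seqs.qza", "ref-seqs-extracted.qza")
```

Parameters left out of the call, such as `identity`, `n_jobs`, `batch_size`, and `read_orientation`, keep the defaults shown in the command-line help.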