Warning
This site has been replaced by the new QIIME 2 “amplicon distribution” documentation, as of the 2025.4 release of QIIME 2. You can still access the content from the “old docs” here for the QIIME 2 2024.10 and earlier releases, but we recommend that you transition to the new documentation at https://amplicon-docs.qiime2.org. Content on this site is no longer updated and may be out of date.
exclude-seqs: Exclude sequences by alignment
Docstring:
Usage: qiime quality-control exclude-seqs [OPTIONS]

  This method aligns feature sequences to a set of reference sequences to
  identify sequences that hit/miss the reference within a specified
  perc_identity, evalue, and perc_query_aligned. This method could be used to
  define a positive filter, e.g., extract only feature sequences that align to
  a certain clade of bacteria; or to define a negative filter, e.g., identify
  sequences that align to contaminant or human DNA sequences that should be
  excluded from subsequent analyses. Note that filtering is performed based on
  the perc_identity, perc_query_aligned, and evalue thresholds (the latter
  only if method==BLAST and an evalue is set). Set perc_identity==0 and/or
  perc_query_aligned==0 to disable these filtering thresholds as necessary.

Inputs:
  --i-query-sequences ARTIFACT FeatureData[Sequence]
      Sequences to test for exclusion  [required]
  --i-reference-sequences ARTIFACT FeatureData[Sequence]
      Reference sequences to align against feature sequences  [required]

Parameters:
  --p-method VALUE Str % Choices('blast', 'blastn-short')¹ | Str % Choices('vsearch')²
      Alignment method to use for matching feature sequences against
      reference sequences  [default: 'blast']
  --p-perc-identity PROPORTION Range(0.0, 1.0, inclusive_end=True)
      Reject match if percent identity to reference is lower. Must be in
      range [0.0, 1.0]  [default: 0.97]
  --p-evalue NUMBER
      BLAST expectation (E) value threshold for saving hits. Reject if E
      value is higher than threshold. This threshold is disabled by default.
      [optional]
  --p-perc-query-aligned NUMBER
      Percent of query sequence that must align to reference in order to be
      accepted as a hit.  [default: 0.97]
  --p-threads NTHREADS
      Number of threads to use. Only applies to vsearch method.
      [default: 1]
  --p-left-justify VALUE Bool % Choices(False)¹ | Bool²
      Reject match if the pairwise alignment begins with gaps
      [default: False]

Outputs:
  --o-sequence-hits ARTIFACT FeatureData[Sequence]
      Subset of feature sequences that align to reference sequences
      [required]
  --o-sequence-misses ARTIFACT FeatureData[Sequence]
      Subset of feature sequences that do not align to reference sequences
      [required]

Miscellaneous:
  --output-dir PATH
      Output unspecified results to a directory
  --verbose / --quiet
      Display verbose output to stdout and/or stderr during execution of
      this action. Or silence output if execution is successful (silence is
      golden).
  --example-data PATH
      Write example data and exit.
  --citations
      Show citations and exit.
  --use-cache DIRECTORY
      Specify the cache to be used for the intermediate work of this action.
      If not provided, the default cache under $TMP/qiime2/ will be used.
      IMPORTANT FOR HPC USERS: If you are on an HPC system and are using
      parallel execution it is important to set this to a location that is
      globally accessible to all nodes in the cluster.
  --help
      Show this message and exit.
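The help text above lists the flags but not a complete invocation. As a minimal sketch, the following Python helper assembles an argument list for a negative filter (for example, screening out contaminant sequences) using only the options documented above. The helper name `build_exclude_seqs_command` and the file names `rep-seqs.qza`, `contaminants.qza`, `hits.qza`, and `misses.qza` are hypothetical placeholders, not part of QIIME 2.

```python
import shlex


def build_exclude_seqs_command(query_qza, reference_qza, hits_qza, misses_qza,
                               method="blast", perc_identity=0.97,
                               perc_query_aligned=0.97, evalue=None,
                               threads=1):
    """Return an argv list for `qiime quality-control exclude-seqs`.

    Hypothetical helper: it only assembles documented flags and does not
    run QIIME 2 itself.
    """
    # The help text restricts perc_identity to the closed range [0.0, 1.0].
    if not 0.0 <= perc_identity <= 1.0:
        raise ValueError("perc_identity must be in [0.0, 1.0]")

    argv = [
        "qiime", "quality-control", "exclude-seqs",
        "--i-query-sequences", query_qza,
        "--i-reference-sequences", reference_qza,
        "--p-method", method,
        "--p-perc-identity", str(perc_identity),
        "--p-perc-query-aligned", str(perc_query_aligned),
        "--o-sequence-hits", hits_qza,
        "--o-sequence-misses", misses_qza,
    ]
    # The E-value threshold is disabled unless explicitly set.
    if evalue is not None:
        argv += ["--p-evalue", str(evalue)]
    # --p-threads is documented as applying to the vsearch method only.
    if method == "vsearch":
        argv += ["--p-threads", str(threads)]
    return argv


# Placeholder file names; substitute your own artifacts.
command = build_exclude_seqs_command(
    "rep-seqs.qza", "contaminants.qza", "hits.qza", "misses.qza"
)
print(shlex.join(command))
```

In an environment where QIIME 2 is installed, the list could be passed to `subprocess.run(command, check=True)`. For a negative filter, the artifact written to `--o-sequence-misses` holds the sequences to keep; for a positive filter, the one written to `--o-sequence-hits` does.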
Import:
from qiime2.plugins.quality_control.methods import exclude_seqs
Docstring:
Exclude sequences by alignment

This method aligns feature sequences to a set of reference sequences to
identify sequences that hit/miss the reference within a specified
perc_identity, evalue, and perc_query_aligned. This method could be used to
define a positive filter, e.g., extract only feature sequences that align to
a certain clade of bacteria; or to define a negative filter, e.g., identify
sequences that align to contaminant or human DNA sequences that should be
excluded from subsequent analyses. Note that filtering is performed based on
the perc_identity, perc_query_aligned, and evalue thresholds (the latter
only if method==BLAST and an evalue is set). Set perc_identity==0 and/or
perc_query_aligned==0 to disable these filtering thresholds as necessary.

Parameters
----------
query_sequences : FeatureData[Sequence]
    Sequences to test for exclusion
reference_sequences : FeatureData[Sequence]
    Reference sequences to align against feature sequences
method : Str % Choices('blast', 'blastn-short')¹ | Str % Choices('vsearch')², optional
    Alignment method to use for matching feature sequences against
    reference sequences
perc_identity : Float % Range(0.0, 1.0, inclusive_end=True), optional
    Reject match if percent identity to reference is lower. Must be in
    range [0.0, 1.0]
evalue : Float, optional
    BLAST expectation (E) value threshold for saving hits. Reject if E
    value is higher than threshold. This threshold is disabled by default.
perc_query_aligned : Float, optional
    Percent of query sequence that must align to reference in order to be
    accepted as a hit.
threads : Threads, optional
    Number of threads to use. Only applies to vsearch method.
left_justify : Bool % Choices(False)¹ | Bool², optional
    Reject match if the pairwise alignment begins with gaps

Returns
-------
sequence_hits : FeatureData[Sequence]
    Subset of feature sequences that align to reference sequences
sequence_misses : FeatureData[Sequence]
    Subset of feature sequences that do not align to reference sequences
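The superscripts ¹ and ² in the signature appear to pair each `method` choice with the allowed values of `left_justify`: the BLAST methods only accept `left_justify=False`, while `vsearch` accepts either value. The sketch below encodes that reading, together with the documented range of `perc_identity`, in a hypothetical pre-flight check (`check_exclude_seqs_options`) and then calls the method through the Python API. The wrapper name `run_exclude_seqs` and any file names passed to it are assumptions for illustration; the QIIME 2 imports are deferred so the check can be used without QIIME 2 installed.

```python
BLAST_METHODS = ("blast", "blastn-short")
ALL_METHODS = BLAST_METHODS + ("vsearch",)


def check_exclude_seqs_options(method="blast", perc_identity=0.97,
                               perc_query_aligned=0.97, evalue=None,
                               threads=1, left_justify=False):
    """Validate options against the documented constraints.

    Hypothetical helper: returns keyword arguments for exclude_seqs.
    """
    if method not in ALL_METHODS:
        raise ValueError(f"method must be one of {ALL_METHODS}")
    if not 0.0 <= perc_identity <= 1.0:
        raise ValueError("perc_identity must be in [0.0, 1.0]")
    # Assumed reading of the ¹/² markers: left_justify=True needs vsearch.
    if method in BLAST_METHODS and left_justify:
        raise ValueError("left_justify=True requires method='vsearch'")

    kwargs = {
        "method": method,
        "perc_identity": perc_identity,
        "perc_query_aligned": perc_query_aligned,
        "threads": threads,
        "left_justify": left_justify,
    }
    # evalue is optional and disabled by default, so pass it only when set.
    if evalue is not None:
        kwargs["evalue"] = evalue
    return kwargs


def run_exclude_seqs(query_qza, reference_qza, **options):
    """Load two FeatureData[Sequence] artifacts and split hits from misses."""
    # Deferred imports: these require a QIIME 2 environment.
    from qiime2 import Artifact
    from qiime2.plugins.quality_control.methods import exclude_seqs

    query = Artifact.load(query_qza)
    reference = Artifact.load(reference_qza)
    results = exclude_seqs(
        query_sequences=query,
        reference_sequences=reference,
        **check_exclude_seqs_options(**options),
    )
    return results.sequence_hits, results.sequence_misses
```

In a QIIME 2 environment, a call such as `hits, misses = run_exclude_seqs("rep-seqs.qza", "contaminants.qza", method="vsearch", threads=4)` would return the two output artifacts, each of which can be written to disk with its `save` method.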