Warning
This site has been replaced by the new QIIME 2 “amplicon distribution” documentation, as of the 2025.4 release of QIIME 2. You can still access the content from the “old docs” here for the QIIME 2 2024.10 and earlier releases, but we recommend that you transition to the new documentation at https://amplicon-docs.qiime2.org. Content on this site is no longer updated and may be out of date.
filter-seqs-length-by-taxon: Filter sequences by length and taxonomic group.
Docstring:
Usage: qiime rescript filter-seqs-length-by-taxon [OPTIONS]

  Filter sequences by length. Can filter both globally by minimum and/or
  maximum length, and set individual threshold for individual taxonomic
  groups (using the "labels" option). Note that filtering can be performed
  for multiple taxonomic groups simultaneously, and nested taxonomic filters
  can be applied (e.g., to apply a more stringent filter for a particular
  genus, but a less stringent filter for other members of the kingdom). For
  global length-based filtering without conditional taxonomic filtering, see
  filter_seqs_length.

Inputs:
  --i-sequences ARTIFACT FeatureData[Sequence]
      Sequences to be filtered by length. [required]
  --i-taxonomy ARTIFACT FeatureData[Taxonomy]
      Taxonomic classifications of sequences to be filtered. [required]

Parameters:
  --p-labels TEXT... List[Str]
      One or more taxonomic labels to use for conditional filtering. For
      example, use this option to set different min/max filter settings for
      individual phyla. Must input the same number of labels as min-lens
      and/or max-lens. If a sequence matches multiple taxonomic labels, this
      method will apply the most stringent threshold(s): the longest minimum
      length and/or the shortest maximum length that is associated with the
      matching labels. [required]
  --p-min-lens INTEGERS... Range(1, None)
      Minimum length thresholds to use for filtering sequences associated
      with each label. If any min-lens are specified, must have the same
      number of min-lens as labels. Sequences that contain this label in
      their taxonomy will be removed if they are less than the specified
      length. [optional]
  --p-max-lens INTEGERS... Range(1, None)
      Maximum length thresholds to use for filtering sequences associated
      with each label. If any max-lens are specified, must have the same
      number of max-lens as labels. Sequences that contain this label in
      their taxonomy will be removed if they are more than the specified
      length. [optional]
  --p-global-min INTEGER Range(1, None)
      The minimum length threshold for filtering all sequences. Any sequence
      shorter than this length will be removed. [optional]
  --p-global-max INTEGER Range(1, None)
      The maximum length threshold for filtering all sequences. Any sequence
      longer than this length will be removed. [optional]

Outputs:
  --o-filtered-seqs ARTIFACT FeatureData[Sequence]
      Sequences that pass the filtering thresholds. [required]
  --o-discarded-seqs ARTIFACT FeatureData[Sequence]
      Sequences that fall outside the filtering thresholds. [required]

Miscellaneous:
  --output-dir PATH
      Output unspecified results to a directory
  --verbose / --quiet
      Display verbose output to stdout and/or stderr during execution of
      this action. Or silence output if execution is successful (silence is
      golden).
  --example-data PATH
      Write example data and exit.
  --citations
      Show citations and exit.
  --use-cache DIRECTORY
      Specify the cache to be used for the intermediate work of this action.
      If not provided, the default cache under $TMP/qiime2/ will be used.
      IMPORTANT FOR HPC USERS: If you are on an HPC system and are using
      parallel execution it is important to set this to a location that is
      globally accessible to all nodes in the cluster.
  --help
      Show this message and exit.
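The "most stringent threshold" rule described for the labels option can be easier to follow as code. The following is a minimal sketch of that rule as the help text states it, not the RESCRIPt implementation. It assumes that "contains this label" means a plain substring match against the taxonomy string, and the function name `passes_length_filter`, the taxonomy string, and the threshold values are all hypothetical.

```python
from typing import Optional, Sequence


def passes_length_filter(
    seq_len: int,
    taxon: str,
    labels: Sequence[str],
    min_lens: Optional[Sequence[int]] = None,
    max_lens: Optional[Sequence[int]] = None,
    global_min: Optional[int] = None,
    global_max: Optional[int] = None,
) -> bool:
    """Return True if a sequence would be kept under the documented rules."""
    # Start from the global thresholds (None means "no limit").
    mins = [global_min] if global_min is not None else []
    maxs = [global_max] if global_max is not None else []

    # Collect the thresholds of every label found in the taxonomy string.
    # Assumption: "contains this label" is a plain substring match.
    for i, label in enumerate(labels):
        if label in taxon:
            if min_lens is not None:
                mins.append(min_lens[i])
            if max_lens is not None:
                maxs.append(max_lens[i])

    # Most stringent wins: the longest minimum and the shortest maximum.
    if mins and seq_len < max(mins):
        return False
    if maxs and seq_len > min(maxs):
        return False
    return True


# Hypothetical nested filter: a looser kingdom rule, a stricter genus rule.
taxon = "k__Bacteria; p__Firmicutes; g__Bacillus"
labels = ["Bacteria", "Bacillus"]
min_lens = [1200, 1400]
print(passes_length_filter(1300, taxon, labels, min_lens))  # False
print(passes_length_filter(1450, taxon, labels, min_lens))  # True
```

In this sketch the 1300-base sequence matches both labels, so the stricter genus minimum of 1400 applies and the sequence is discarded, even though it satisfies the kingdom minimum of 1200.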
Import:
from qiime2.plugins.rescript.methods import filter_seqs_length_by_taxon
Docstring:
Filter sequences by length and taxonomic group.

Filter sequences by length. Can filter both globally by minimum and/or
maximum length, and set individual threshold for individual taxonomic groups
(using the "labels" option). Note that filtering can be performed for
multiple taxonomic groups simultaneously, and nested taxonomic filters can be
applied (e.g., to apply a more stringent filter for a particular genus, but a
less stringent filter for other members of the kingdom). For global
length-based filtering without conditional taxonomic filtering, see
filter_seqs_length.

Parameters
----------
sequences : FeatureData[Sequence]
    Sequences to be filtered by length.
taxonomy : FeatureData[Taxonomy]
    Taxonomic classifications of sequences to be filtered.
labels : List[Str]
    One or more taxonomic labels to use for conditional filtering. For
    example, use this option to set different min/max filter settings for
    individual phyla. Must input the same number of labels as min_lens
    and/or max_lens. If a sequence matches multiple taxonomic labels, this
    method will apply the most stringent threshold(s): the longest minimum
    length and/or the shortest maximum length that is associated with the
    matching labels.
min_lens : List[Int % Range(1, None)], optional
    Minimum length thresholds to use for filtering sequences associated with
    each label. If any min_lens are specified, must have the same number of
    min_lens as labels. Sequences that contain this label in their taxonomy
    will be removed if they are less than the specified length.
max_lens : List[Int % Range(1, None)], optional
    Maximum length thresholds to use for filtering sequences associated with
    each label. If any max_lens are specified, must have the same number of
    max_lens as labels. Sequences that contain this label in their taxonomy
    will be removed if they are more than the specified length.
global_min : Int % Range(1, None), optional
    The minimum length threshold for filtering all sequences. Any sequence
    shorter than this length will be removed.
global_max : Int % Range(1, None), optional
    The maximum length threshold for filtering all sequences. Any sequence
    longer than this length will be removed.

Returns
-------
filtered_seqs : FeatureData[Sequence]
    Sequences that pass the filtering thresholds.
discarded_seqs : FeatureData[Sequence]
    Sequences that fall outside the filtering thresholds.
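A call through the Python API might look like the sketch below. It assumes a working QIIME 2 environment with RESCRIPt installed. The helper names `check_threshold_lists` and `filter_by_taxon`, the file names, the labels, and the threshold values are all hypothetical. The pre-flight check only mirrors the documented requirements (one threshold per label, each value at least 1) and is not part of the plugin.

```python
from typing import List, Optional


def check_threshold_lists(labels: List[str],
                          min_lens: Optional[List[int]] = None,
                          max_lens: Optional[List[int]] = None) -> None:
    """Raise ValueError if a threshold list does not line up with labels."""
    for name, values in (("min_lens", min_lens), ("max_lens", max_lens)):
        if values is None:
            continue
        if len(values) != len(labels):
            raise ValueError(
                f"{name} has {len(values)} values for {len(labels)} labels")
        if any(v < 1 for v in values):
            raise ValueError(f"{name} values must be >= 1")


def filter_by_taxon(seqs_path: str, taxonomy_path: str, labels: List[str],
                    min_lens: List[int], max_lens: List[int]):
    """Load both artifacts, run the action, return (filtered, discarded)."""
    # Imported here so the check above also works without QIIME 2 installed.
    from qiime2 import Artifact
    from qiime2.plugins.rescript.methods import filter_seqs_length_by_taxon

    check_threshold_lists(labels, min_lens, max_lens)
    results = filter_seqs_length_by_taxon(
        sequences=Artifact.load(seqs_path),
        taxonomy=Artifact.load(taxonomy_path),
        labels=labels,
        min_lens=min_lens,
        max_lens=max_lens,
    )
    return results.filtered_seqs, results.discarded_seqs


# Hypothetical usage (file names and thresholds are placeholders):
# filtered, discarded = filter_by_taxon(
#     "seqs.qza", "taxonomy.qza",
#     labels=["Archaea", "Bacteria", "Eukaryota"],
#     min_lens=[900, 1200, 1400], max_lens=[1800, 2000, 2200])
# filtered.save("filtered-seqs.qza")
# discarded.save("discarded-seqs.qza")
```

Saving the discarded sequences alongside the filtered ones makes it possible to inspect what each threshold removed before settling on final values.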