Docstring:
Usage: qiime rescript filter-seqs-length-by-taxon [OPTIONS]
Filter sequences by length. Can filter both globally by minimum and/or
maximum length, and set individual thresholds for individual taxonomic groups
(using the "labels" option). Note that filtering can be performed for
multiple taxonomic groups simultaneously, and nested taxonomic filters can
be applied (e.g., to apply a more stringent filter for a particular genus,
but a less stringent filter for other members of the kingdom). For global
length-based filtering without conditional taxonomic filtering, see
filter_seqs_length.
Inputs:
--i-sequences ARTIFACT FeatureData[Sequence]
Sequences to be filtered by length. [required]
--i-taxonomy ARTIFACT FeatureData[Taxonomy]
Taxonomic classifications of sequences to be
filtered. [required]
Parameters:
--p-labels TEXT... List[Str]
One or more taxonomic labels to use for conditional
filtering. For example, use this option to set
different min/max filter settings for individual
phyla. Must input the same number of labels as
min-lens and/or max-lens. If a sequence matches
multiple taxonomic labels, this method will apply
the most stringent threshold(s): the longest minimum
length and/or the shortest maximum length that is
associated with the matching labels. [required]
--p-min-lens INTEGERS... Range(1, None)
Minimum length thresholds to use for filtering
sequences associated with each label. If any
min-lens are specified, must have the same number of
min-lens as labels. Sequences that contain this
label in their taxonomy will be removed if they are
shorter than the specified length. [optional]
--p-max-lens INTEGERS... Range(1, None)
Maximum length thresholds to use for filtering
sequences associated with each label. If any
max-lens are specified, must have the same number of
max-lens as labels. Sequences that contain this
label in their taxonomy will be removed if they are
longer than the specified length. [optional]
--p-global-min INTEGER Range(1, None)
The minimum length threshold for filtering all
sequences. Any sequence shorter than this length
will be removed. [optional]
--p-global-max INTEGER Range(1, None)
The maximum length threshold for filtering all
sequences. Any sequence longer than this length will
be removed. [optional]
Outputs:
--o-filtered-seqs ARTIFACT FeatureData[Sequence]
Sequences that pass the filtering thresholds.
[required]
--o-discarded-seqs ARTIFACT FeatureData[Sequence]
Sequences that fall outside the filtering
thresholds. [required]
Miscellaneous:
--output-dir PATH Output unspecified results to a directory
--verbose / --quiet Display verbose output to stdout and/or stderr
during execution of this action. Or silence output
if execution is successful (silence is golden).
--example-data PATH Write example data and exit.
--citations Show citations and exit.
--use-cache DIRECTORY Specify the cache to be used for the intermediate
work of this action. If not provided, the default
cache under $TMP/qiime2/ will be used.
IMPORTANT FOR HPC USERS: If you are on an HPC system
and are using parallel execution it is important to
set this to a location that is globally accessible
to all nodes in the cluster.
--help Show this message and exit.
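
Example (illustrative sketch, not part of the generated help): the options above can be combined into a single invocation. The Python helper below only assembles the argument list and enforces the rule that min-lens and max-lens must each have one value per label. The helper name build_filter_command, the file names, the labels, and the length values are all hypothetical; only the flags are taken from the help text. Multiple values are passed after a single flag, following the TEXT... and INTEGERS... notation shown above.

```python
from typing import List, Optional


def build_filter_command(
    sequences: str,
    taxonomy: str,
    labels: List[str],
    filtered_out: str,
    discarded_out: str,
    min_lens: Optional[List[int]] = None,
    max_lens: Optional[List[int]] = None,
    global_min: Optional[int] = None,
    global_max: Optional[int] = None,
) -> List[str]:
    """Build argv for `qiime rescript filter-seqs-length-by-taxon`.

    Hypothetical helper: it does not run QIIME 2, it only builds the list.
    """
    # The help text requires one threshold per label whenever a list is given.
    for flag, values in (("--p-min-lens", min_lens), ("--p-max-lens", max_lens)):
        if values is not None and len(values) != len(labels):
            raise ValueError(f"{flag} needs exactly one value per label")

    cmd = [
        "qiime", "rescript", "filter-seqs-length-by-taxon",
        "--i-sequences", sequences,
        "--i-taxonomy", taxonomy,
        "--p-labels", *labels,
    ]
    # Optional per-label thresholds, passed as several values after one flag.
    if min_lens:
        cmd += ["--p-min-lens", *(str(n) for n in min_lens)]
    if max_lens:
        cmd += ["--p-max-lens", *(str(n) for n in max_lens)]
    # Optional global thresholds that apply to every sequence.
    if global_min is not None:
        cmd += ["--p-global-min", str(global_min)]
    if global_max is not None:
        cmd += ["--p-global-max", str(global_max)]
    cmd += ["--o-filtered-seqs", filtered_out, "--o-discarded-seqs", discarded_out]
    return cmd


# Illustrative values only: a different minimum length for each of three labels.
example = build_filter_command(
    sequences="seqs.qza",
    taxonomy="taxonomy.qza",
    labels=["Archaea", "Bacteria", "Eukaryota"],
    min_lens=[900, 1200, 1400],
    filtered_out="seqs-filtered.qza",
    discarded_out="seqs-discarded.qza",
)
print(" ".join(example))
```

In an environment where QIIME 2 and RESCRIPt are installed, the resulting list could be handed to subprocess.run(example, check=True).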
Import:
from qiime2.plugins.rescript.methods import filter_seqs_length_by_taxon
Docstring:
Filter sequences by length and taxonomic group.
Filter sequences by length. Can filter both globally by minimum and/or
maximum length, and set individual thresholds for individual taxonomic
groups (using the "labels" option). Note that filtering can be performed
for multiple taxonomic groups simultaneously, and nested taxonomic filters
can be applied (e.g., to apply a more stringent filter for a particular
genus, but a less stringent filter for other members of the kingdom). For
global length-based filtering without conditional taxonomic filtering, see
filter_seqs_length.
Parameters
----------
sequences : FeatureData[Sequence]
Sequences to be filtered by length.
taxonomy : FeatureData[Taxonomy]
Taxonomic classifications of sequences to be filtered.
labels : List[Str]
One or more taxonomic labels to use for conditional filtering. For
example, use this option to set different min/max filter settings for
individual phyla. Must input the same number of labels as min_lens
and/or max_lens. If a sequence matches multiple taxonomic labels, this
method will apply the most stringent threshold(s): the longest minimum
length and/or the shortest maximum length that is associated with the
matching labels.
min_lens : List[Int % Range(1, None)], optional
Minimum length thresholds to use for filtering sequences associated
with each label. If any min_lens are specified, must have the same
number of min_lens as labels. Sequences that contain this label in
their taxonomy will be removed if they are shorter than the specified
length.
max_lens : List[Int % Range(1, None)], optional
Maximum length thresholds to use for filtering sequences associated
with each label. If any max_lens are specified, must have the same
number of max_lens as labels. Sequences that contain this label in
their taxonomy will be removed if they are longer than the specified
length.
global_min : Int % Range(1, None), optional
The minimum length threshold for filtering all sequences. Any sequence
shorter than this length will be removed.
global_max : Int % Range(1, None), optional
The maximum length threshold for filtering all sequences. Any sequence
longer than this length will be removed.
Returns
-------
filtered_seqs : FeatureData[Sequence]
Sequences that pass the filtering thresholds.
discarded_seqs : FeatureData[Sequence]
Sequences that fall outside the filtering thresholds.
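
Example (illustrative sketch, not the RESCRIPt implementation): the "most stringent threshold" rule described under labels can be hard to picture for nested filters, so the pure-Python sketch below works through it on toy data. It assumes that a label matches when it occurs as a substring of the taxonomy string, and that global_min and global_max act as additional bounds on every sequence. The function names, taxonomy strings, and length values are all hypothetical.

```python
from typing import Dict, List, Optional, Tuple


def resolve_thresholds(
    taxon: str,
    labels: List[str],
    min_lens: Optional[List[int]] = None,
    max_lens: Optional[List[int]] = None,
    global_min: Optional[int] = None,
    global_max: Optional[int] = None,
) -> Tuple[Optional[int], Optional[int]]:
    """Return the (minimum, maximum) length bounds for one taxonomy string."""
    mins = [] if global_min is None else [global_min]
    maxs = [] if global_max is None else [global_max]
    for i, label in enumerate(labels):
        # Assumption: substring match between label and taxonomy string.
        if label in taxon:
            if min_lens is not None:
                mins.append(min_lens[i])
            if max_lens is not None:
                maxs.append(max_lens[i])
    # Most stringent: the longest minimum and the shortest maximum.
    return (max(mins) if mins else None, min(maxs) if maxs else None)


def partition_by_length(
    seqs: Dict[str, str],
    taxonomy: Dict[str, str],
    labels: List[str],
    min_lens: Optional[List[int]] = None,
    max_lens: Optional[List[int]] = None,
    global_min: Optional[int] = None,
    global_max: Optional[int] = None,
) -> Tuple[Dict[str, str], Dict[str, str]]:
    """Split sequences into (filtered, discarded), mirroring the two outputs."""
    filtered: Dict[str, str] = {}
    discarded: Dict[str, str] = {}
    for seq_id, seq in seqs.items():
        lo, hi = resolve_thresholds(
            taxonomy[seq_id], labels, min_lens, max_lens, global_min, global_max
        )
        too_short = lo is not None and len(seq) < lo
        too_long = hi is not None and len(seq) > hi
        (discarded if too_short or too_long else filtered)[seq_id] = seq
    return filtered, discarded


# Toy data: a stricter minimum for one genus nested inside a looser kingdom filter.
seqs = {"s1": "ACGTAC", "s2": "ACGTAC", "s3": "ACG"}
taxonomy = {
    "s1": "k__Bacteria; p__Firmicutes; g__Bacillus",
    "s2": "k__Bacteria; p__Proteobacteria; g__Escherichia",
    "s3": "k__Archaea",
}
kept, dropped = partition_by_length(
    seqs, taxonomy, labels=["k__Bacteria", "g__Bacillus"], min_lens=[5, 8]
)
print(sorted(kept), sorted(dropped))
```

In this toy run, s1 matches both labels and is therefore held to the longer minimum of 8, so it is discarded at length 6, while s2 matches only the kingdom label and passes the minimum of 5. s3 matches no label and is kept because no global threshold is set.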