filter-seqs-length-by-taxon: Filter sequences by length and taxonomic group.

Docstring:

Usage: qiime rescript filter-seqs-length-by-taxon [OPTIONS]

  Filter sequences by length. Can filter both globally by minimum and/or
  maximum length, and set individual thresholds for individual taxonomic groups
  (using the "labels" option). Note that filtering can be performed for
  multiple taxonomic groups simultaneously, and nested taxonomic filters can
  be applied (e.g., to apply a more stringent filter for a particular genus,
  but a less stringent filter for other members of the kingdom). For global
  length-based filtering without conditional taxonomic filtering, see
  filter_seqs_length.

Inputs:
  --i-sequences ARTIFACT FeatureData[Sequence]
                          Sequences to be filtered by length.       [required]
  --i-taxonomy ARTIFACT FeatureData[Taxonomy]
                          Taxonomic classifications of sequences to be
                          filtered.                                 [required]
Parameters:
  --p-labels TEXT...      One or more taxonomic labels to use for conditional
    List[Str]             filtering. For example, use this option to set
                          different min/max filter settings for individual
                          phyla. Must input the same number of labels as
                          min-lens and/or max-lens. If a sequence matches
                          multiple taxonomic labels, this method will apply
                          the most stringent threshold(s): the longest minimum
                          length and/or the shortest maximum length that is
                          associated with the matching labels.      [required]
  --p-min-lens INTEGERS...
    Range(1, None)        Minimum length thresholds to use for filtering
                          sequences associated with each label. If any
                          min-lens are specified, must have the same number of
                          min-lens as labels. Sequences that contain this
                          label in their taxonomy will be removed if they are
                          less than the specified length.           [optional]
  --p-max-lens INTEGERS...
    Range(1, None)        Maximum length thresholds to use for filtering
                          sequences associated with each label. If any
                          max-lens are specified, must have the same number of
                          max-lens as labels. Sequences that contain this
                          label in their taxonomy will be removed if they are
                          more than the specified length.           [optional]
  --p-global-min INTEGER  The minimum length threshold for filtering all
    Range(1, None)        sequences. Any sequence shorter than this length
                          will be removed.                          [optional]
  --p-global-max INTEGER  The maximum length threshold for filtering all
    Range(1, None)        sequences. Any sequence longer than this length will
                          be removed.                               [optional]
Outputs:
  --o-filtered-seqs ARTIFACT FeatureData[Sequence]
                          Sequences that pass the filtering thresholds.
                                                                    [required]
  --o-discarded-seqs ARTIFACT FeatureData[Sequence]
                          Sequences that fall outside the filtering
                          thresholds.                               [required]
Miscellaneous:
  --output-dir PATH       Output unspecified results to a directory
  --verbose / --quiet     Display verbose output to stdout and/or stderr
                          during execution of this action. Or silence output
                          if execution is successful (silence is golden).
  --example-data PATH     Write example data and exit.
  --citations             Show citations and exit.
  --use-cache DIRECTORY   Specify the cache to be used for the intermediate
                          work of this action. If not provided, the default
                          cache under $TMP/qiime2/ will be used.
                          IMPORTANT FOR HPC USERS: If you are on an HPC system
                          and are using parallel execution it is important to
                          set this to a location that is globally accessible
                          to all nodes in the cluster.
  --help                  Show this message and exit.

Import:

from qiime2.plugins.rescript.methods import filter_seqs_length_by_taxon

Docstring:

Filter sequences by length and taxonomic group.

Filter sequences by length. Can filter both globally by minimum and/or
maximum length, and set individual thresholds for individual taxonomic
groups (using the "labels" option). Note that filtering can be performed
for multiple taxonomic groups simultaneously, and nested taxonomic filters
can be applied (e.g., to apply a more stringent filter for a particular
genus, but a less stringent filter for other members of the kingdom). For
global length-based filtering without conditional taxonomic filtering, see
filter_seqs_length.

Parameters
----------
sequences : FeatureData[Sequence]
    Sequences to be filtered by length.
taxonomy : FeatureData[Taxonomy]
    Taxonomic classifications of sequences to be filtered.
labels : List[Str]
    One or more taxonomic labels to use for conditional filtering. For
    example, use this option to set different min/max filter settings for
    individual phyla. Must input the same number of labels as min_lens
    and/or max_lens. If a sequence matches multiple taxonomic labels, this
    method will apply the most stringent threshold(s): the longest minimum
    length and/or the shortest maximum length that is associated with the
    matching labels.
min_lens : List[Int % Range(1, None)], optional
    Minimum length thresholds to use for filtering sequences associated
    with each label. If any min_lens are specified, must have the same
    number of min_lens as labels. Sequences that contain this label in
    their taxonomy will be removed if they are less than the specified
    length.
max_lens : List[Int % Range(1, None)], optional
    Maximum length thresholds to use for filtering sequences associated
    with each label. If any max_lens are specified, must have the same
    number of max_lens as labels. Sequences that contain this label in
    their taxonomy will be removed if they are more than the specified
    length.
global_min : Int % Range(1, None), optional
    The minimum length threshold for filtering all sequences. Any sequence
    shorter than this length will be removed.
global_max : Int % Range(1, None), optional
    The maximum length threshold for filtering all sequences. Any sequence
    longer than this length will be removed.

Returns
-------
filtered_seqs : FeatureData[Sequence]
    Sequences that pass the filtering thresholds.
discarded_seqs : FeatureData[Sequence]
    Sequences that fall outside the filtering thresholds.
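
Example (illustrative sketch):

The rule described under labels (apply the longest minimum and/or the shortest maximum among all matching labels) can be sketched in plain Python. This is not the plugin's implementation. The helper names effective_thresholds and split_by_length, the toy sequences, and the taxonomy strings are all hypothetical, and the sketch assumes that a label "matches" when it occurs as a substring of the taxonomy string and that a sequence exactly at a threshold is kept.

```python
def effective_thresholds(taxon, labels, min_lens=None, max_lens=None,
                         global_min=None, global_max=None):
    """Return the (min, max) length bounds that apply to one taxonomy string.

    Hypothetical helper: None means "no bound on that side".
    """
    # The docs require one threshold per label whenever a list is given.
    if min_lens is not None and len(min_lens) != len(labels):
        raise ValueError("min_lens must have the same number of items as labels")
    if max_lens is not None and len(max_lens) != len(labels):
        raise ValueError("max_lens must have the same number of items as labels")

    # Global thresholds apply to every sequence.
    mins = [] if global_min is None else [global_min]
    maxs = [] if global_max is None else [global_max]

    for i, label in enumerate(labels):
        if label in taxon:  # assumption: plain substring match
            if min_lens is not None:
                mins.append(min_lens[i])
            if max_lens is not None:
                maxs.append(max_lens[i])

    # Most stringent: longest minimum, shortest maximum.
    return (max(mins) if mins else None, min(maxs) if maxs else None)


def split_by_length(seqs, taxonomy, labels, **thresholds):
    """Partition {id: sequence} into (kept, discarded) dictionaries."""
    kept, discarded = {}, {}
    for seq_id, seq in seqs.items():
        lo, hi = effective_thresholds(taxonomy[seq_id], labels, **thresholds)
        too_short = lo is not None and len(seq) < lo
        too_long = hi is not None and len(seq) > hi
        (discarded if too_short or too_long else kept)[seq_id] = seq
    return kept, discarded


# Toy data (made up for illustration): nested filters on kingdom and genus.
seqs = {"s1": "A" * 1500, "s2": "A" * 1000, "s3": "A" * 1300, "s4": "A" * 800}
taxonomy = {
    "s1": "k__Bacteria; g__Bacillus",
    "s2": "k__Bacteria; g__Bacillus",
    "s3": "k__Bacteria; g__Escherichia",
    "s4": "k__Archaea; g__Methanobrevibacter",
}
labels = ["k__Bacteria", "g__Bacillus"]

kept, discarded = split_by_length(
    seqs, taxonomy, labels, min_lens=[1200, 1400], global_min=900
)
print(sorted(kept), sorted(discarded))
```

In this toy run the Bacillus sequences match both labels and are held to the stricter 1400 minimum, the other bacterial sequence is held to 1200, and the archaeal sequence is checked only against the global minimum of 900.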