Docstring:
Usage: qiime rescript cull-seqs [OPTIONS]
Filter DNA or RNA sequences that contain ambiguous bases and homopolymers,
and output the filtered DNA sequences. Sequences that have at least the
specified number of IUPAC-compliant degenerate bases are removed. Remaining
sequences are removed if they contain homopolymers equal to or longer than
the specified length. If the input consists of RNA sequences, they are
reverse transcribed to DNA before filtering.
Inputs:
--i-sequences ARTIFACT FeatureData[Sequence | RNASequence]
DNA or RNA Sequences to be screened for removal
based on degenerate base and homopolymer screening
criteria. [required]
Parameters:
--p-num-degenerates INTEGER Range(1, None)
Sequences with N, or more, degenerate bases will be
removed. [default: 5]
--p-homopolymer-length INTEGER Range(2, None)
Sequences containing a homopolymer sequence of
length N, or greater, will be removed. [default: 8]
--p-n-jobs INTEGER Range(1, None)
Number of concurrent processes to use while
processing sequences. More is faster but typically
should not be higher than the number of available
CPUs. Output sequence order may change when using
multiple jobs. [default: 1]
Outputs:
--o-clean-sequences ARTIFACT FeatureData[Sequence]
The resulting DNA sequences that pass degenerate
base and homopolymer screening criteria. [required]
Miscellaneous:
--output-dir PATH Output unspecified results to a directory
--verbose / --quiet Display verbose output to stdout and/or stderr
during execution of this action. Or silence output
if execution is successful (silence is golden).
--example-data PATH Write example data and exit.
--citations Show citations and exit.
--use-cache DIRECTORY Specify the cache to be used for the intermediate
work of this action. If not provided, the default
cache under $TMP/qiime2/ will be used.
IMPORTANT FOR HPC USERS: If you are on an HPC system
and are using parallel execution it is important to
set this to a location that is globally accessible
to all nodes in the cluster.
--help Show this message and exit.
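The options above can be assembled into a command line programmatically. The following is a minimal sketch, not part of RESCRIPt: the helper name build_cull_seqs_command and the file names are hypothetical, while the flags, defaults, and ranges are taken from the help text above.

```python
def build_cull_seqs_command(sequences, clean_sequences,
                            num_degenerates=5, homopolymer_length=8,
                            n_jobs=1):
    """Build an argv list for `qiime rescript cull-seqs`.

    Hypothetical helper: it only mirrors the flags, defaults, and
    Range(...) bounds shown in the help text; it does not run anything.
    """
    # Mirror the documented lower bounds: Range(1, None) and Range(2, None).
    if num_degenerates < 1:
        raise ValueError("num_degenerates must be >= 1")
    if homopolymer_length < 2:
        raise ValueError("homopolymer_length must be >= 2")
    if n_jobs < 1:
        raise ValueError("n_jobs must be >= 1")
    return [
        "qiime", "rescript", "cull-seqs",
        "--i-sequences", str(sequences),
        "--p-num-degenerates", str(num_degenerates),
        "--p-homopolymer-length", str(homopolymer_length),
        "--p-n-jobs", str(n_jobs),
        "--o-clean-sequences", str(clean_sequences),
    ]


# Example (file names are placeholders); run it with, for instance,
# subprocess.run(cmd, check=True) in an environment where QIIME 2 is active.
cmd = build_cull_seqs_command("seqs.qza", "seqs-culled.qza", n_jobs=4)
```

Passing the command as a list rather than a single string avoids shell quoting problems with paths that contain spaces.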
Import:
from qiime2.plugins.rescript.methods import cull_seqs
Docstring:
Removes sequences that contain at least the specified number of degenerate
bases and/or homopolymers of a given length.
Filter DNA or RNA sequences that contain ambiguous bases and homopolymers,
and output the filtered DNA sequences. Sequences that have at least the
specified number of IUPAC-compliant degenerate bases are removed. Remaining
sequences are removed if they contain homopolymers equal to or longer than
the specified length. If the input consists of RNA sequences, they are
reverse transcribed to DNA before filtering.
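The two screening criteria described above can be illustrated with a small pure-Python sketch. This is an illustration of the documented behavior under stated assumptions, not RESCRIPt's implementation: the function names passes_cull and cull are hypothetical, sequences are plain strings, RNA is converted by replacing U with T, and a run of any single repeated character (including a degenerate code) is assumed to count as a homopolymer.

```python
import re

# IUPAC degenerate (ambiguity) codes for nucleotides.
DEGENERATE_BASES = frozenset("RYSWKMBDHVN")


def passes_cull(seq, num_degenerates=5, homopolymer_length=8):
    """Return True if `seq` would be kept under the documented criteria."""
    seq = seq.upper().replace("U", "T")  # assumption: treat RNA input as DNA
    # Criterion 1: remove if the count of degenerate bases reaches the limit.
    n_degenerate = sum(base in DEGENERATE_BASES for base in seq)
    if n_degenerate >= num_degenerates:
        return False
    # Criterion 2: remove if any character repeats homopolymer_length
    # or more times in a row (one character plus length-1 repeats).
    run = re.compile(r"(.)\1{%d,}" % (homopolymer_length - 1))
    return run.search(seq) is None


def cull(records, num_degenerates=5, homopolymer_length=8):
    """Filter a dict of {id: sequence}; return the passing sequences as DNA."""
    return {
        seq_id: seq.upper().replace("U", "T")
        for seq_id, seq in records.items()
        if passes_cull(seq, num_degenerates, homopolymer_length)
    }
```

Both thresholds are inclusive, so with the defaults a sequence with exactly 5 degenerate bases or a run of exactly 8 identical bases is removed.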
Parameters
----------
sequences : FeatureData[Sequence | RNASequence]
DNA or RNA Sequences to be screened for removal based on degenerate
base and homopolymer screening criteria.
num_degenerates : Int % Range(1, None), optional
Sequences with N, or more, degenerate bases will be removed.
homopolymer_length : Int % Range(2, None), optional
Sequences containing a homopolymer sequence of length N, or greater,
will be removed.
n_jobs : Int % Range(1, None), optional
Number of concurrent processes to use while processing sequences. More
is faster but typically should not be higher than the number of
available CPUs. Output sequence order may change when using multiple
jobs.
Returns
-------
clean_sequences : FeatureData[Sequence]
The resulting DNA sequences that pass degenerate base and homopolymer
screening criteria.
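A call through the Python API might look like the sketch below. The wrapper name run_cull_seqs and the file paths are hypothetical, and the default values are copied from the command-line help on the assumption that the Python action uses the same ones. It requires an environment with QIIME 2 and RESCRIPt installed.

```python
def run_cull_seqs(input_qza, output_qza, num_degenerates=5,
                  homopolymer_length=8, n_jobs=1):
    """Load sequences, cull them, and save the result (hypothetical wrapper)."""
    # Imported inside the function so the sketch can be defined
    # even where QIIME 2 is not installed.
    from qiime2 import Artifact
    from qiime2.plugins.rescript.methods import cull_seqs

    # Input must be FeatureData[Sequence] or FeatureData[RNASequence].
    sequences = Artifact.load(input_qza)
    results = cull_seqs(
        sequences=sequences,
        num_degenerates=num_degenerates,
        homopolymer_length=homopolymer_length,
        n_jobs=n_jobs,
    )
    # The output is exposed under the name given in the Returns section.
    clean = results.clean_sequences
    clean.save(output_qza)
    return clean


# Example call (paths are placeholders):
# run_cull_seqs("seqs.qza", "seqs-culled.qza", n_jobs=4)
```

Because output order may change when n_jobs is greater than 1, downstream steps should match sequences by identifier rather than by position.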