Docstring:
Usage: qiime quality-control exclude-seqs [OPTIONS]
This method aligns feature sequences to a set of reference sequences to
identify sequences that hit/miss the reference within a specified
perc_identity, evalue, and perc_query_aligned. This method could be used to
define a positive filter, e.g., extract only feature sequences that align to
a certain clade of bacteria; or to define a negative filter, e.g., identify
sequences that align to contaminant or human DNA sequences that should be
excluded from subsequent analyses. Note that filtering is performed based on
the perc_identity, perc_query_aligned, and evalue thresholds (the latter
only if method==BLAST and an evalue is set). Set perc_identity==0 and/or
perc_query_aligned==0 to disable these filtering thresholds as necessary.
Inputs:
--i-query-sequences ARTIFACT FeatureData[Sequence]
Sequences to test for exclusion [required]
--i-reference-sequences ARTIFACT FeatureData[Sequence]
Reference sequences to align against feature
sequences [required]
Parameters:
--p-method VALUE Str % Choices('blast', 'blastn-short')¹ | Str %
Choices('vsearch')²
Alignment method to use for matching feature
sequences against reference sequences
[default: 'blast']
--p-perc-identity PROPORTION Range(0.0, 1.0, inclusive_end=True)
Reject match if percent identity to reference is
lower than this value. Must be in range [0.0, 1.0]
[default: 0.97]
--p-evalue NUMBER BLAST expectation (E) value threshold for saving
hits. Reject if E value is higher than threshold.
This threshold is disabled by default. [optional]
--p-perc-query-aligned NUMBER
Proportion of the query sequence that must align to
the reference in order to be accepted as a hit,
expressed as a fraction (e.g., 0.97 = 97%).
[default: 0.97]
--p-threads NTHREADS Number of threads to use. Only applies to vsearch
method. [default: 1]
--p-left-justify VALUE Bool % Choices(False)¹ | Bool²
Reject match if the pairwise alignment begins with
gaps [default: False]
Outputs:
--o-sequence-hits ARTIFACT FeatureData[Sequence]
Subset of feature sequences that align to reference
sequences [required]
--o-sequence-misses ARTIFACT FeatureData[Sequence]
Subset of feature sequences that do not align to
reference sequences [required]
Miscellaneous:
--output-dir PATH Output unspecified results to a directory
--verbose / --quiet Display verbose output to stdout and/or stderr
during execution of this action. Or silence output
if execution is successful (silence is golden).
--example-data PATH Write example data and exit.
--citations Show citations and exit.
--use-cache DIRECTORY Specify the cache to be used for the intermediate
work of this action. If not provided, the default
cache under $TMP/qiime2/ will be used.
IMPORTANT FOR HPC USERS: If you are on an HPC system
and are using parallel execution it is important to
set this to a location that is globally accessible
to all nodes in the cluster.
--help Show this message and exit.
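The options above can be combined into a command line programmatically. The following is a minimal sketch, not part of the plugin: the helper name `build_exclude_seqs_command` and the `.qza` file names are hypothetical, and the choice to emit `--p-evalue` only for the BLAST methods and `--p-threads` only for vsearch is an assumption that follows the help text ("only if method==BLAST", "Only applies to vsearch method").

```python
import shlex

BLAST_METHODS = {"blast", "blastn-short"}
ALL_METHODS = BLAST_METHODS | {"vsearch"}


def build_exclude_seqs_command(query, reference, hits, misses,
                               method="blast", perc_identity=0.97,
                               perc_query_aligned=0.97, evalue=None,
                               threads=1):
    """Return an argv list for `qiime quality-control exclude-seqs`.

    Hypothetical helper: it only assembles flags documented in the help
    text above and does not run anything.
    """
    if method not in ALL_METHODS:
        raise ValueError(f"unknown method: {method!r}")
    if not 0.0 <= perc_identity <= 1.0:
        raise ValueError("perc_identity must be in [0.0, 1.0]")

    argv = [
        "qiime", "quality-control", "exclude-seqs",
        "--i-query-sequences", query,
        "--i-reference-sequences", reference,
        "--p-method", method,
        "--p-perc-identity", str(perc_identity),
        "--p-perc-query-aligned", str(perc_query_aligned),
        "--o-sequence-hits", hits,
        "--o-sequence-misses", misses,
    ]
    # Assumption: pass the E-value only for BLAST-based methods.
    if evalue is not None and method in BLAST_METHODS:
        argv += ["--p-evalue", str(evalue)]
    # Assumption: pass the thread count only for vsearch.
    if method == "vsearch":
        argv += ["--p-threads", str(threads)]
    return argv


# Placeholder file names; print a shell-quoted command for inspection.
print(shlex.join(build_exclude_seqs_command(
    "query-seqs.qza", "reference-seqs.qza", "hits.qza", "misses.qza")))
```

To execute the command rather than print it, the returned list could be handed to `subprocess.run(argv, check=True)` in an environment where QIIME 2 is installed.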
Import:
from qiime2.plugins.quality_control.methods import exclude_seqs
Docstring:
Exclude sequences by alignment
This method aligns feature sequences to a set of reference sequences to
identify sequences that hit/miss the reference within a specified
perc_identity, evalue, and perc_query_aligned. This method could be used to
define a positive filter, e.g., extract only feature sequences that align
to a certain clade of bacteria; or to define a negative filter, e.g.,
identify sequences that align to contaminant or human DNA sequences that
should be excluded from subsequent analyses. Note that filtering is
performed based on the perc_identity, perc_query_aligned, and evalue
thresholds (the latter only if method==BLAST and an evalue is set). Set
perc_identity==0 and/or perc_query_aligned==0 to disable these filtering
thresholds as necessary.
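The threshold rule described above can be illustrated with a small pure-Python model. This is a simplified sketch of the hit/miss decision only, assuming the alignment statistics are already available; the `Alignment` class and the `is_hit` and `partition` functions are hypothetical names, and the plugin itself delegates alignment and filtering to BLAST or vsearch rather than to code like this.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional, Tuple


@dataclass
class Alignment:
    """Best alignment of one query sequence (illustrative fields only)."""
    query_id: str
    identity: float       # fraction of identical positions, 0.0 to 1.0
    query_aligned: float  # fraction of the query covered, 0.0 to 1.0
    evalue: float         # expectation value reported by the aligner


def is_hit(aln: Alignment, perc_identity: float = 0.97,
           perc_query_aligned: float = 0.97,
           evalue: Optional[float] = None) -> bool:
    """Apply the three thresholds; a threshold of 0 never rejects."""
    if aln.identity < perc_identity:
        return False
    if aln.query_aligned < perc_query_aligned:
        return False
    # The E-value threshold is disabled unless a value is given.
    if evalue is not None and aln.evalue > evalue:
        return False
    return True


def partition(query_ids: Iterable[str], alignments: Iterable[Alignment],
              **thresholds) -> Tuple[List[str], List[str]]:
    """Split query IDs into (hits, misses), preserving input order."""
    hit_ids = {a.query_id for a in alignments if is_hit(a, **thresholds)}
    hits, misses = [], []
    for qid in query_ids:
        (hits if qid in hit_ids else misses).append(qid)
    return hits, misses


alignments = [
    Alignment("seq1", identity=0.99, query_aligned=1.00, evalue=1e-50),
    Alignment("seq2", identity=0.90, query_aligned=1.00, evalue=1e-20),
]
# "seq3" has no alignment at all, so it can only be a miss.
hits, misses = partition(["seq1", "seq2", "seq3"], alignments)
```

With the default thresholds, `seq2` falls below `perc_identity` and joins `seq3` among the misses; calling `partition(..., perc_identity=0.0)` disables that check, so `seq2` would be counted as a hit.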
Parameters
----------
query_sequences : FeatureData[Sequence]
Sequences to test for exclusion
reference_sequences : FeatureData[Sequence]
Reference sequences to align against feature sequences
method : Str % Choices('blast', 'blastn-short')¹ | Str % Choices('vsearch')², optional
Alignment method to use for matching feature sequences against
reference sequences
perc_identity : Float % Range(0.0, 1.0, inclusive_end=True), optional
Reject match if percent identity to reference is lower than this value.
Must be in range [0.0, 1.0]
evalue : Float, optional
BLAST expectation (E) value threshold for saving hits. Reject if E
value is higher than threshold. This threshold is disabled by default.
perc_query_aligned : Float, optional
Proportion of the query sequence that must align to the reference in
order to be accepted as a hit, expressed as a fraction (e.g., 0.97 = 97%).
threads : Threads, optional
Number of threads to use. Only applies to vsearch method.
left_justify : Bool % Choices(False)¹ | Bool², optional
Reject match if the pairwise alignment begins with gaps
Returns
-------
sequence_hits : FeatureData[Sequence]
Subset of feature sequences that align to reference sequences
sequence_misses : FeatureData[Sequence]
Subset of feature sequences that do not align to reference sequences
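A call through the Python API might look like the sketch below. It assumes a QIIME 2 environment with the quality-control plugin installed; the wrapper name `run_exclude_seqs` and all file paths are hypothetical, and the keyword defaults simply mirror the defaults shown in the command-line help. The imports sit inside the function so that the module can be loaded without QIIME 2 present.

```python
def run_exclude_seqs(query_path, reference_path, hits_path, misses_path,
                     method="blast", perc_identity=0.97,
                     perc_query_aligned=0.97):
    """Load two FeatureData[Sequence] artifacts, run exclude_seqs, save both outputs."""
    # Imported lazily: these require a working QIIME 2 installation.
    from qiime2 import Artifact
    from qiime2.plugins.quality_control.methods import exclude_seqs

    query = Artifact.load(query_path)
    reference = Artifact.load(reference_path)

    results = exclude_seqs(
        query_sequences=query,
        reference_sequences=reference,
        method=method,
        perc_identity=perc_identity,
        perc_query_aligned=perc_query_aligned,
    )

    # The results object exposes the two outputs listed under "Returns".
    results.sequence_hits.save(hits_path)
    results.sequence_misses.save(misses_path)
    return results
```

For a negative filter (for example, removing sequences that match a contaminant reference), the artifact saved to `misses_path` is the one to carry forward; for a positive filter, it is the artifact saved to `hits_path`.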