
classify-consensus-blast: BLAST+ consensus taxonomy classifier

Citations
  • Christiam Camacho, George Coulouris, Vahram Avagyan, Ning Ma, Jason Papadopoulos, Kevin Bealer, and Thomas L. Madden. BLAST+: architecture and applications. BMC Bioinformatics, 10(1):421, 2009. doi:10.1186/1471-2105-10-421.

Docstring:

Usage: qiime feature-classifier classify-consensus-blast [OPTIONS]

  Assign taxonomy to query sequences using BLAST+. Performs BLAST+ local
  alignment between query and reference_reads, then assigns a consensus
  taxonomy to each query sequence: up to maxaccepts hits are kept, and a
  taxonomic assignment is accepted when at least a min_consensus fraction of
  those hits share it. Note that maxaccepts selects the first N hits with >
  perc_identity similarity to query, not the top N matches. For top N hits,
  use classify-consensus-vsearch.

Inputs:
  --i-query ARTIFACT FeatureData[Sequence]
                          Query sequences.                          [required]
  --i-reference-reads ARTIFACT FeatureData[Sequence]
                          Reference sequences.                      [required]
  --i-reference-taxonomy ARTIFACT FeatureData[Taxonomy]
                          Reference taxonomy labels.                [required]
Parameters:
  --p-maxaccepts INTEGER  Maximum number of hits to keep for each query.
    Range(1, None)        BLAST will choose the first N hits in the reference
                          database that exceed perc-identity similarity to
                          query. NOTE: the database is not sorted by
                          similarity to query, so these are the first N hits
                          that pass the threshold, not necessarily the top N
                          hits.                                  [default: 10]
  --p-perc-identity PROPORTION Range(0.0, 1.0, inclusive_end=True)
                          Reject match if percent identity to query is lower.
                                                                [default: 0.8]
  --p-query-cov PROPORTION Range(0.0, 1.0, inclusive_end=True)
                          Reject match if query alignment coverage per
                          high-scoring pair is lower. Note: this uses blastn's
                          qcov_hsp_perc parameter, and may not behave
                          identically to the query-cov parameter used by
                          classify-consensus-vsearch.           [default: 0.8]
  --p-strand TEXT Choices('both', 'plus', 'minus')
                          Align against reference sequences in forward
                          ("plus"), reverse ("minus"), or both directions
                          ("both").                          [default: 'both']
  --p-evalue NUMBER       BLAST expectation value (E) threshold for saving
                          hits.                               [default: 0.001]
  --p-output-no-hits / --p-no-output-no-hits
                          Report both matching and non-matching queries.
                          WARNING: always use the default setting for this
                          option unless you know what you are doing! If you
                          set this option to False, your sequences and feature
                          table will need to be filtered to exclude
                          unclassified sequences, otherwise you may run into
                          errors downstream from missing feature IDs. Set to
                          False to mirror default BLAST search.
                                                               [default: True]
  --p-min-consensus NUMBER Range(0.5, 1.0, inclusive_start=False,
    inclusive_end=True)   Minimum fraction of assignments that must match the
                          top hit for it to be accepted as the consensus
                          assignment.
                                                               [default: 0.51]
  --p-unassignable-label TEXT
                          Annotation given to sequences without any hits.
                                                       [default: 'Unassigned']
Outputs:
  --o-classification ARTIFACT FeatureData[Taxonomy]
                          Taxonomy classifications of query sequences.
                                                                    [required]
  --o-search-results ARTIFACT
    FeatureData[BLAST6]   Top hits for each query.                  [required]
Miscellaneous:
  --output-dir PATH       Output unspecified results to a directory
  --verbose / --quiet     Display verbose output to stdout and/or stderr
                          during execution of this action. Or silence output
                          if execution is successful (silence is golden).
  --example-data PATH     Write example data and exit.
  --citations             Show citations and exit.
  --help                  Show this message and exit.
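
As an illustration of how the options above fit together, the following sketch assembles a command line as a Python argument list. The helper name and the .qza file names are hypothetical placeholders; only the flags come from the help text above. The sketch prints the command rather than running it.

```python
import shlex


def build_blast_classifier_command(query, reference_reads, reference_taxonomy,
                                   classification_out, search_results_out,
                                   maxaccepts=10, perc_identity=0.8):
    """Return the argument list for one classify-consensus-blast run.

    Hypothetical helper: the defaults mirror those listed in the help text.
    """
    return [
        "qiime", "feature-classifier", "classify-consensus-blast",
        "--i-query", query,
        "--i-reference-reads", reference_reads,
        "--i-reference-taxonomy", reference_taxonomy,
        "--p-maxaccepts", str(maxaccepts),
        "--p-perc-identity", str(perc_identity),
        "--o-classification", classification_out,
        "--o-search-results", search_results_out,
    ]


# All file names below are placeholders, not files shipped with QIIME 2.
command = build_blast_classifier_command(
    "rep-seqs.qza", "ref-seqs.qza", "ref-taxonomy.qza",
    "classification.qza", "search-results.qza",
)
print(shlex.join(command))
# To execute it, something like subprocess.run(command, check=True) could be
# used in an environment where QIIME 2 is installed.
```

Both outputs are marked [required], so the sketch always passes --o-classification and --o-search-results; --output-dir is the alternative listed under Miscellaneous.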

Import:

from qiime2.plugins.feature_classifier.pipelines import classify_consensus_blast
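
With that import, a call might look like the sketch below. The wrapper function and the file paths are hypothetical; the keyword arguments and the two result names are taken from the docstring that follows. The QIIME 2 imports are deferred into the function body so the sketch can be loaded without QIIME 2 installed.

```python
def classify_with_blast(query_path, reference_reads_path,
                        reference_taxonomy_path, *, maxaccepts=10,
                        perc_identity=0.8, min_consensus=0.51):
    """Run classify_consensus_blast on three .qza files (placeholder paths)."""
    # Deferred imports: these require a working QIIME 2 environment.
    from qiime2 import Artifact
    from qiime2.plugins.feature_classifier.pipelines import (
        classify_consensus_blast,
    )

    results = classify_consensus_blast(
        query=Artifact.load(query_path),
        reference_reads=Artifact.load(reference_reads_path),
        reference_taxonomy=Artifact.load(reference_taxonomy_path),
        maxaccepts=maxaccepts,
        perc_identity=perc_identity,
        min_consensus=min_consensus,
    )
    # Output names as listed in the Returns section of the docstring.
    return results.classification, results.search_results


# Example call (placeholder file names, needs QIIME 2 and real artifacts):
# classification, search_results = classify_with_blast(
#     "rep-seqs.qza", "ref-seqs.qza", "ref-taxonomy.qza")
# classification.save("classification.qza")
```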

Docstring:

BLAST+ consensus taxonomy classifier

Assign taxonomy to query sequences using BLAST+. Performs BLAST+ local
alignment between query and reference_reads, then assigns a consensus
taxonomy to each query sequence: up to maxaccepts hits are kept, and a
taxonomic assignment is accepted when at least a min_consensus fraction of
those hits share it. Note that maxaccepts selects the first N hits with >
perc_identity similarity to query, not the top N matches. For top N hits,
use classify-consensus-vsearch.
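
The difference between "first N" and "top N" can be shown with a toy model. This is not how BLAST+ searches its database; it is a minimal sketch that assumes hits arrive in database order as (reference ID, identity) pairs, and that a hit passes when its identity is not lower than perc_identity (following the perc_identity description below). The function names and the data are made up for illustration.

```python
def first_n_hits(hits, perc_identity, maxaccepts):
    """Keep the first maxaccepts hits, in database order, that pass."""
    accepted = []
    for ref_id, identity in hits:
        if identity >= perc_identity:
            accepted.append((ref_id, identity))
            if len(accepted) == maxaccepts:
                break  # stop early: later, better hits are never seen
    return accepted


def top_n_hits(hits, perc_identity, maxaccepts):
    """Keep the maxaccepts best-scoring hits among all that pass."""
    passing = [hit for hit in hits if hit[1] >= perc_identity]
    return sorted(passing, key=lambda hit: hit[1], reverse=True)[:maxaccepts]


# Made-up hits in database order.
hits = [("refA", 0.85), ("refB", 0.70), ("refC", 0.99),
        ("refD", 0.90), ("refE", 0.97)]

print(first_n_hits(hits, perc_identity=0.8, maxaccepts=2))
# [('refA', 0.85), ('refC', 0.99)]
print(top_n_hits(hits, perc_identity=0.8, maxaccepts=2))
# [('refC', 0.99), ('refE', 0.97)]
```

In this toy data refE is a closer match than refA, but the first-N rule never reaches it because two passing hits were already found.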

Parameters
----------
query : FeatureData[Sequence]
    Query sequences.
reference_reads : FeatureData[Sequence]
    Reference sequences.
reference_taxonomy : FeatureData[Taxonomy]
    Reference taxonomy labels.
maxaccepts : Int % Range(1, None), optional
    Maximum number of hits to keep for each query. BLAST will choose the
    first N hits in the reference database that exceed perc_identity
    similarity to query. NOTE: the database is not sorted by similarity to
    query, so these are the first N hits that pass the threshold, not
    necessarily the top N hits.
perc_identity : Float % Range(0.0, 1.0, inclusive_end=True), optional
    Reject match if percent identity to query is lower.
query_cov : Float % Range(0.0, 1.0, inclusive_end=True), optional
    Reject match if query alignment coverage per high-scoring pair is
    lower. Note: this uses blastn's qcov_hsp_perc parameter, and may not
    behave identically to the query_cov parameter used by
    classify-consensus-vsearch.
strand : Str % Choices('both', 'plus', 'minus'), optional
    Align against reference sequences in forward ("plus"), reverse
    ("minus"), or both directions ("both").
evalue : Float, optional
    BLAST expectation value (E) threshold for saving hits.
output_no_hits : Bool, optional
    Report both matching and non-matching queries. WARNING: always use the
    default setting for this option unless you know what you are doing! If
    you set this option to False, your sequences and feature table will
    need to be filtered to exclude unclassified sequences, otherwise you
    may run into errors downstream from missing feature IDs. Set to False
    to mirror default BLAST search.
min_consensus : Float % Range(0.5, 1.0, inclusive_start=False, inclusive_end=True), optional
    Minimum fraction of assignments that must match the top hit for it to
    be accepted as the consensus assignment.
unassignable_label : Str, optional
    Annotation given to sequences without any hits.
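
To make the roles of min_consensus and unassignable_label concrete, here is a minimal sketch of one way a consensus could be computed from the taxonomy strings of the accepted hits. It is an assumption for illustration, not the plugin's implementation: lineages are assumed to be semicolon-delimited, the consensus is extended rank by rank while the most common lineage prefix is shared by at least min_consensus of the hits, and the label is also returned when no rank reaches consensus. The function name and example lineages are hypothetical.

```python
from collections import Counter


def consensus_taxonomy(lineages, min_consensus=0.51,
                       unassignable_label="Unassigned"):
    """Return the deepest lineage prefix shared by enough of the hits."""
    if not 0.5 < min_consensus <= 1.0:
        raise ValueError("min_consensus must be in the range (0.5, 1.0]")
    if not lineages:
        return unassignable_label  # no hits at all

    split = [[rank.strip() for rank in lineage.split(";")]
             for lineage in lineages]
    # Assumption: compare only down to the shallowest lineage.
    depth = min(len(ranks) for ranks in split)

    consensus = []
    for level in range(1, depth + 1):
        counts = Counter(tuple(ranks[:level]) for ranks in split)
        prefix, count = counts.most_common(1)[0]
        if count / len(split) < min_consensus:
            break  # agreement too weak at this rank: stop here
        consensus = list(prefix)
    return "; ".join(consensus) if consensus else unassignable_label


# Made-up lineages for four accepted hits.
hit_lineages = [
    "k__Bacteria; p__Firmicutes; c__Bacilli",
    "k__Bacteria; p__Firmicutes; c__Clostridia",
    "k__Bacteria; p__Firmicutes; c__Bacilli",
    "k__Bacteria; p__Proteobacteria; c__Gammaproteobacteria",
]
print(consensus_taxonomy(hit_lineages))
# k__Bacteria; p__Firmicutes
print(consensus_taxonomy(hit_lineages, min_consensus=0.8))
# k__Bacteria
```

Because min_consensus must be greater than 0.5, at most one prefix can qualify at each rank, so the result never mixes ranks from different lineages.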

Returns
-------
classification : FeatureData[Taxonomy]
    Taxonomy classifications of query sequences.
search_results : FeatureData[BLAST6]
    Top hits for each query.
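
The search_results output is typed FeatureData[BLAST6]. Assuming it holds rows in BLAST's standard 12-column tabular layout (outfmt 6), a single exported row could be read as sketched below. The column names are the standard BLAST ones; the function name and the example row are made up. Note that pident in this layout is a percentage (0 to 100), whereas perc_identity above is a proportion (0.0 to 1.0).

```python
# Standard column order of BLAST tabular output (outfmt 6).
BLAST6_COLUMNS = (
    "qseqid", "sseqid", "pident", "length", "mismatch", "gapopen",
    "qstart", "qend", "sstart", "send", "evalue", "bitscore",
)


def parse_blast6_line(line):
    """Split one tab-separated row into a dict keyed by column name.

    Values are kept as strings, so no assumption is made here about how
    rows for queries without hits are filled in.
    """
    fields = line.rstrip("\n").split("\t")
    if len(fields) != len(BLAST6_COLUMNS):
        raise ValueError(
            f"expected {len(BLAST6_COLUMNS)} fields, got {len(fields)}")
    return dict(zip(BLAST6_COLUMNS, fields))


# Made-up example row.
example_row = "query1\trefA\t98.5\t250\t3\t1\t1\t250\t10\t259\t1e-120\t440"
row = parse_blast6_line(example_row)

# Convert the percentage to a proportion for comparison with perc_identity.
identity_proportion = float(row["pident"]) / 100
print(row["qseqid"], row["sseqid"], identity_proportion)
```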