Warning

This site has been replaced by the new QIIME 2 “amplicon distribution” documentation, as of the 2025.4 release of QIIME 2. You can still access the content from the “old docs” here for the QIIME 2 2024.10 and earlier releases, but we recommend that you transition to the new documentation at https://amplicon-docs.qiime2.org. Content on this site is no longer updated and may be out of date.

merge-taxa: Compare taxonomies and select the longest, highest scoring, or find the least common ancestor.

Docstring:

Usage: qiime rescript merge-taxa [OPTIONS]

  Compare taxonomy annotations and choose the best one. Can select the longest
  taxonomy annotation, the highest scoring, or the least common ancestor.
  Note: when a tie occurs, the last taxonomy added takes precedence.

Inputs:
  --i-data ARTIFACTS... List[FeatureData[Taxonomy]]
                         Two or more feature taxonomies to be merged.
                                                                    [required]
Parameters:
  --p-mode TEXT Choices('len', 'lca', 'score', 'super', 'majority')
                         How to merge feature taxonomies: "len" will select
                         the taxonomy with the most elements (e.g., species
                         level will beat genus level); "lca" will find the
                         least common ancestor and report this consensus
                         taxonomy; "score" will select the taxonomy with the
                         highest score (e.g., confidence or consensus score).
                         Note that "score" assumes that this score is always
                         contained as the second column in a feature taxonomy
                         dataframe. "majority" finds the LCA consensus while
                         giving preference to majority labels. "super" finds
                         the LCA consensus while giving preference to majority
                         labels and collapsing substrings into superstrings.
                         For example, when a more specific taxonomy does not
                         contradict a less specific taxonomy, the more
                         specific is chosen. That is, "g__Faecalibacterium;
                         s__prausnitzii" will be preferred over
                         "g__Faecalibacterium; s__".          [default: 'len']
  --p-rank-handle-regex TEXT
                         Regular expression indicating which taxonomic rank a
                         label belongs to; this handle is stripped from the
                         label prior to operating on the taxonomy. The net
                         effect is that ambiguous or empty levels can be
                         removed prior to comparison, enabling selection of
                         taxonomies with more complete taxonomic information.
                         For example, "^[dkpcofgs]__" will recognize
                         Greengenes or SILVA rank handles. Note that rank
                         handles are removed but not replaced; use the
                         new-rank-handles parameter to replace the rank
                         handles.                   [default: '^[dkpcofgs]__']
  --p-new-rank-handles VALUES... List[Str % Choices('disable')] | List[Str
    % Choices('domain', 'superkingdom', 'kingdom', 'subkingdom',
    'superphylum', 'phylum', 'subphylum', 'infraphylum', 'superclass',
    'class', 'subclass', 'infraclass', 'cohort', 'superorder', 'order',
    'suborder', 'infraorder', 'parvorder', 'superfamily', 'family',
    'subfamily', 'tribe', 'subtribe', 'genus', 'subgenus', 'species group',
    'species subgroup', 'species', 'subspecies', 'forma')]
                         Specifies the set of rank handles to prepend to
                         taxonomic labels at each rank. Note that merged
                         taxonomies will only contain as many levels as there
                         are handles if this parameter is used. This will trim
                         all taxonomies to the given levels, even if longer
                         annotations exist. Note that this parameter will
                         prepend rank handles whether or not they already
                         exist in the taxonomy, so should ALWAYS be used in
                         conjunction with `rank-handle-regex` if rank handles
                         exist in any of the inputs. Use 'disable' to prevent
                         prepending 'new-rank-handles'.
[default: ['domain', 'phylum', 'class', 'order', 'family', 'genus', 'species']]
  --p-unclassified-label TEXT
                         Specifies what label should be used for taxonomies
                         that could not be resolved (when LCA modes are used).
                                                       [default: 'Unassigned']
Outputs:
  --o-merged-data ARTIFACT FeatureData[Taxonomy]
                                                                    [required]
Miscellaneous:
  --output-dir PATH      Output unspecified results to a directory
  --verbose / --quiet    Display verbose output to stdout and/or stderr
                         during execution of this action. Or silence output if
                         execution is successful (silence is golden).
  --example-data PATH    Write example data and exit.
  --citations            Show citations and exit.
  --use-cache DIRECTORY  Specify the cache to be used for the intermediate
                         work of this action. If not provided, the default
                         cache under $TMP/qiime2/ will be used.
                         IMPORTANT FOR HPC USERS: If you are on an HPC system
                         and are using parallel execution it is important to
                         set this to a location that is globally accessible to
                         all nodes in the cluster.
  --help                 Show this message and exit.
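
The merge modes described above can be illustrated with a minimal pure-Python sketch. This is not RESCRIPt's implementation: the helper names (`split_taxonomy`, `merge_len`, `merge_score`, `merge_lca`) are hypothetical, rank handles are left untouched, and only the "len", "score", and "lca" modes are approximated, assuming semicolon-delimited taxonomy strings.

```python
def split_taxonomy(taxon):
    """Split a semicolon-delimited taxonomy string into stripped labels."""
    return [label.strip() for label in taxon.split(";") if label.strip()]


def merge_len(taxa):
    """'len'-style choice: most levels wins; on a tie the last one added wins."""
    best = taxa[0]
    for taxon in taxa[1:]:
        if len(split_taxonomy(taxon)) >= len(split_taxonomy(best)):
            best = taxon
    return best


def merge_score(annotations):
    """'score'-style choice over (taxon, score) pairs; the last one wins ties."""
    best = annotations[0]
    for candidate in annotations[1:]:
        if candidate[1] >= best[1]:
            best = candidate
    return best[0]


def merge_lca(taxa, unclassified_label="Unassigned"):
    """'lca'-style choice: keep only the leading levels shared by all inputs."""
    shared = []
    for labels in zip(*(split_taxonomy(t) for t in taxa)):
        if len(set(labels)) != 1:
            break  # first disagreement ends the common lineage
        shared.append(labels[0])
    return "; ".join(shared) if shared else unclassified_label


# Made-up annotations for one feature from two hypothetical classifiers.
a = "d__Bacteria; p__Firmicutes; c__Clostridia"
b = "d__Bacteria; p__Firmicutes; c__Bacilli; o__Lactobacillales"

longest = merge_len([a, b])
top_scoring = merge_score([(a, 0.98), (b, 0.71)])
consensus = merge_lca([a, b])
```

Here `longest` is the four-level annotation `b`, `top_scoring` is `a`, and `consensus` is `d__Bacteria; p__Firmicutes`. When the inputs share no leading level, the sketch falls back to the unclassified label, mirroring the role of `--p-unclassified-label` in the LCA modes.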

Import:

from qiime2.plugins.rescript.methods import merge_taxa
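
A typical call through the Artifact API might look like the following sketch. It assumes a QIIME 2 environment with RESCRIPt installed; the wrapper name `merge_two_taxonomies` and its path arguments are hypothetical and stand in for your own `FeatureData[Taxonomy]` artifacts.

```python
def merge_two_taxonomies(path_a, path_b, out_path, mode="lca"):
    """Merge two FeatureData[Taxonomy] artifacts and save the merged result.

    The paths are placeholders for illustration; supply your own .qza files.
    """
    # Imported lazily so the sketch can be read without QIIME 2 installed.
    from qiime2 import Artifact
    from qiime2.plugins.rescript.methods import merge_taxa

    # `data` takes a list of two or more taxonomy artifacts.
    taxonomies = [Artifact.load(path_a), Artifact.load(path_b)]
    results = merge_taxa(data=taxonomies, mode=mode)

    # The single output is exposed as `merged_data` on the results object.
    results.merged_data.save(out_path)
    return results.merged_data
```

Because ties go to the last taxonomy added, the order of the artifacts in the list can affect the result in the "len" and "score" modes.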

Docstring:

Compare taxonomies and select the longest, highest scoring, or find the
least common ancestor.

Compare taxonomy annotations and choose the best one. Can select the
longest taxonomy annotation, the highest scoring, or the least common
ancestor. Note: when a tie occurs, the last taxonomy added takes precedence.

Parameters
----------
data : List[FeatureData[Taxonomy]]
    Two or more feature taxonomies to be merged.
mode : Str % Choices('len', 'lca', 'score', 'super', 'majority'), optional
    How to merge feature taxonomies: "len" will select the taxonomy with
    the most elements (e.g., species level will beat genus level); "lca"
    will find the least common ancestor and report this consensus taxonomy;
    "score" will select the taxonomy with the highest score (e.g.,
    confidence or consensus score). Note that "score" assumes that this
    score is always contained as the second column in a feature taxonomy
    dataframe. "majority" finds the LCA consensus while giving preference
    to majority labels. "super" finds the LCA consensus while giving
    preference to majority labels and collapsing substrings into
    superstrings. For example, when a more specific taxonomy does not
    contradict a less specific taxonomy, the more specific is chosen. That
    is, "g__Faecalibacterium; s__prausnitzii" will be preferred over
    "g__Faecalibacterium; s__".
rank_handle_regex : Str, optional
    Regular expression indicating which taxonomic rank a label belongs to;
    this handle is stripped from the label prior to operating on the
    taxonomy. The net effect is that ambiguous or empty levels can be
    removed prior to comparison, enabling selection of taxonomies with more
    complete taxonomic information. For example, "^[dkpcofgs]__" will
    recognize Greengenes or SILVA rank handles. Note that rank handles are
    removed but not replaced; use the `new_rank_handles` parameter to
    replace the rank handles.
new_rank_handles : List[Str % Choices('disable')] | List[Str % Choices('domain', 'superkingdom', 'kingdom', 'subkingdom', 'superphylum', 'phylum', 'subphylum', 'infraphylum', 'superclass', 'class', 'subclass', 'infraclass', 'cohort', 'superorder', 'order', 'suborder', 'infraorder', 'parvorder', 'superfamily', 'family', 'subfamily', 'tribe', 'subtribe', 'genus', 'subgenus', 'species group', 'species subgroup', 'species', 'subspecies', 'forma')], optional
    Specifies the set of rank handles to prepend to taxonomic labels at
    each rank. Note that merged taxonomies will only contain as many levels
    as there are handles if this parameter is used. This will trim all
    taxonomies to the given levels, even if longer annotations exist. Note
    that this parameter will prepend rank handles whether or not they
    already exist in the taxonomy, so should ALWAYS be used in conjunction
    with `rank_handle_regex` if rank handles exist in any of the inputs.
    Use 'disable' to prevent prepending `new_rank_handles`.
unclassified_label : Str, optional
    Specifies what label should be used for taxonomies that could not be
    resolved (when LCA modes are used).

Returns
-------
merged_data : FeatureData[Taxonomy]
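
The interplay between `rank_handle_regex` and `new_rank_handles` can be pictured with a small standalone sketch. It is an illustration under assumptions, not RESCRIPt's code: the helper names are hypothetical, the rank-to-prefix mapping in `ASSUMED_PREFIXES` is assumed for the default ranks only, and empty levels are dropped only at the end of a lineage as a simplification.

```python
import re

# Assumed prefixes, for illustration only; RESCRIPt defines the real mapping.
ASSUMED_PREFIXES = {
    "domain": "d__", "phylum": "p__", "class": "c__", "order": "o__",
    "family": "f__", "genus": "g__", "species": "s__",
}


def strip_handles(taxon, rank_handle_regex=r"^[dkpcofgs]__"):
    """Remove rank handles from each label, then drop trailing empty levels."""
    labels = [re.sub(rank_handle_regex, "", label.strip())
              for label in taxon.split(";")]
    while labels and not labels[-1]:
        labels.pop()  # e.g. a bare "s__" carries no information
    return labels


def add_handles(labels, new_rank_handles):
    """Prepend a handle per rank; zip trims labels beyond the given handles."""
    return "; ".join(ASSUMED_PREFIXES[rank] + label
                     for rank, label in zip(new_rank_handles, labels))


specific = strip_handles("g__Faecalibacterium; s__prausnitzii")
empty_species = strip_handles("g__Faecalibacterium; s__")

trimmed = add_handles(
    strip_handles("d__Bacteria; p__Firmicutes; c__Clostridia"),
    ["domain", "phylum"],
)
```

After stripping, `specific` has two labels while `empty_species` has one, which is why the more complete annotation can win a length-based comparison. `trimmed` is `d__Bacteria; p__Firmicutes`: the class level is lost because only two handles were supplied. Skipping `strip_handles` before `add_handles` would duplicate the prefixes (for example `d__d__Bacteria`), which is the situation the docstring warns about.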
