Warning
This site has been replaced by the new QIIME 2 “amplicon distribution” documentation, as of the 2025.4 release of QIIME 2. You can still access the content from the “old docs” here for the QIIME 2 2024.10 and earlier releases, but we recommend that you transition to the new documentation at https://amplicon-docs.qiime2.org. Content on this site is no longer updated and may be out of date.
merge-taxa: Compare taxonomies and select the longest, highest scoring, or find the least common ancestor.
Docstring:
Usage: qiime rescript merge-taxa [OPTIONS]

  Compare taxonomy annotations and choose the best one. Can select the
  longest taxonomy annotation, the highest scoring, or the least common
  ancestor. Note: when a tie occurs, the last taxonomy added takes
  precedence.

Inputs:
  --i-data ARTIFACTS... List[FeatureData[Taxonomy]]
      Two or more feature taxonomies to be merged. [required]

Parameters:
  --p-mode TEXT Choices('len', 'lca', 'score', 'super', 'majority')
      How to merge feature taxonomies:
      "len" will select the taxonomy with the most elements (e.g., species
      level will beat genus level);
      "lca" will find the least common ancestor and report this consensus
      taxonomy;
      "score" will select the taxonomy with the highest score (e.g.,
      confidence or consensus score). Note that "score" assumes that this
      score is always contained as the second column in a feature taxonomy
      dataframe.
      "majority" finds the LCA consensus while giving preference to
      majority labels.
      "super" finds the LCA consensus while giving preference to majority
      labels and collapsing substrings into superstrings. For example, when
      a more specific taxonomy does not contradict a less specific
      taxonomy, the more specific is chosen. That is,
      "g__Faecalibacterium; s__prausnitzii" will be preferred over
      "g__Faecalibacterium; s__". [default: 'len']
  --p-rank-handle-regex TEXT
      Regular expression indicating which taxonomic rank a label belongs
      to; this handle is stripped from the label prior to operating on the
      taxonomy. The net effect is that ambiguous or empty levels can be
      removed prior to comparison, enabling selection of taxonomies with
      more complete taxonomic information. For example, "^[dkpcofgs]__"
      will recognize greengenes or silva rank handles. Note that rank
      handles are removed but not replaced; use the new_rank_handles
      parameter to replace the rank handles. [default: '^[dkpcofgs]__']
  --p-new-rank-handles VALUES...
      List[Str % Choices('disable')] | List[Str % Choices('domain',
      'superkingdom', 'kingdom', 'subkingdom', 'superphylum', 'phylum',
      'subphylum', 'infraphylum', 'superclass', 'class', 'subclass',
      'infraclass', 'cohort', 'superorder', 'order', 'suborder',
      'infraorder', 'parvorder', 'superfamily', 'family', 'subfamily',
      'tribe', 'subtribe', 'genus', 'subgenus', 'species group',
      'species subgroup', 'species', 'subspecies', 'forma')]
      Specifies the set of rank handles to prepend to taxonomic labels at
      each rank. Note that merged taxonomies will only contain as many
      levels as there are handles if this parameter is used. This will trim
      all taxonomies to the given levels, even if longer annotations exist.
      Note that this parameter will prepend rank handles whether or not
      they already exist in the taxonomy, so should ALWAYS be used in
      conjunction with `rank-handle-regex` if rank handles exist in any of
      the inputs. Use 'disable' to prevent prepending 'new-rank-handles'.
      [default: ['domain', 'phylum', 'class', 'order', 'family', 'genus',
      'species']]
  --p-unclassified-label TEXT
      Specifies what label should be used for taxonomies that could not be
      resolved (when LCA modes are used). [default: 'Unassigned']

Outputs:
  --o-merged-data ARTIFACT FeatureData[Taxonomy]
      [required]

Miscellaneous:
  --output-dir PATH
      Output unspecified results to a directory.
  --verbose / --quiet
      Display verbose output to stdout and/or stderr during execution of
      this action. Or silence output if execution is successful (silence
      is golden).
  --example-data PATH
      Write example data and exit.
  --citations
      Show citations and exit.
  --use-cache DIRECTORY
      Specify the cache to be used for the intermediate work of this
      action. If not provided, the default cache under $TMP/qiime2/ will be
      used. IMPORTANT FOR HPC USERS: If you are on an HPC system and are
      using parallel execution it is important to set this to a location
      that is globally accessible to all nodes in the cluster.
  --help
      Show this message and exit.
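The interaction between the rank handle regex and the "len" mode can be illustrated with a minimal sketch in plain Python. This is not RESCRIPt's implementation: the helper names `strip_handles` and `pick_longest` are hypothetical, and the sketch assumes that levels left empty after stripping are dropped before comparison and that a tie goes to the later input, as the help text describes.

```python
import re

# Default pattern from the help text; matches greengenes or silva style handles.
RANK_HANDLE = re.compile(r"^[dkpcofgs]__")


def strip_handles(taxonomy, pattern=RANK_HANDLE):
    """Split on ';', strip rank handles, and drop levels that end up empty.

    Dropping empty levels is an assumption made for this illustration.
    """
    levels = [pattern.sub("", level.strip()) for level in taxonomy.split(";")]
    return [level for level in levels if level]


def pick_longest(taxonomies):
    """Return the stripped taxonomy with the most levels (hypothetical helper).

    Using >= means that, on a tie, the taxonomy added last wins.
    """
    best = []
    for taxonomy in taxonomies:
        levels = strip_handles(taxonomy)
        if len(levels) >= len(best):
            best = levels
    return best


a = "g__Faecalibacterium; s__"
b = "g__Faecalibacterium; s__prausnitzii"
print(pick_longest([a, b]))  # ['Faecalibacterium', 'prausnitzii']
```

Without the stripping step both strings would have two levels, so the empty "s__" placeholder could win on order alone; removing handles first is what lets the more complete annotation be selected.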
Import:
from qiime2.plugins.rescript.methods import merge_taxa
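A call through the Python API might look like the following sketch. It assumes an environment with QIIME 2 and the RESCRIPt plugin installed; the wrapper name `merge_two_taxonomies` and its path arguments are hypothetical and exist only for this example. The imports are deferred into the function body so that defining it does not itself require QIIME 2.

```python
def merge_two_taxonomies(path_a, path_b, out_path, mode="lca"):
    """Load two FeatureData[Taxonomy] artifacts, merge them, and save the result.

    Hypothetical convenience wrapper; requires QIIME 2 with RESCRIPt at call time.
    """
    from qiime2 import Artifact
    from qiime2.plugins.rescript.methods import merge_taxa

    tax_a = Artifact.load(path_a)
    tax_b = Artifact.load(path_b)

    # On a tie, the last taxonomy in the list takes precedence.
    results = merge_taxa(data=[tax_a, tax_b], mode=mode)

    # The single output of the action is named merged_data.
    results.merged_data.save(out_path)
    return results.merged_data
```

For instance, `merge_two_taxonomies("taxonomy-a.qza", "taxonomy-b.qza", "merged-taxonomy.qza")` would merge two artifacts using the "lca" mode; the file names here are placeholders.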
Docstring:
Compare taxonomies and select the longest, highest scoring, or find the
least common ancestor.

Compare taxonomy annotations and choose the best one. Can select the
longest taxonomy annotation, the highest scoring, or the least common
ancestor. Note: when a tie occurs, the last taxonomy added takes
precedence.

Parameters
----------
data : List[FeatureData[Taxonomy]]
    Two or more feature taxonomies to be merged.
mode : Str % Choices('len', 'lca', 'score', 'super', 'majority'), optional
    How to merge feature taxonomies:
    "len" will select the taxonomy with the most elements (e.g., species
    level will beat genus level);
    "lca" will find the least common ancestor and report this consensus
    taxonomy;
    "score" will select the taxonomy with the highest score (e.g.,
    confidence or consensus score). Note that "score" assumes that this
    score is always contained as the second column in a feature taxonomy
    dataframe.
    "majority" finds the LCA consensus while giving preference to majority
    labels.
    "super" finds the LCA consensus while giving preference to majority
    labels and collapsing substrings into superstrings. For example, when
    a more specific taxonomy does not contradict a less specific taxonomy,
    the more specific is chosen. That is,
    "g__Faecalibacterium; s__prausnitzii" will be preferred over
    "g__Faecalibacterium; s__".
rank_handle_regex : Str, optional
    Regular expression indicating which taxonomic rank a label belongs to;
    this handle is stripped from the label prior to operating on the
    taxonomy. The net effect is that ambiguous or empty levels can be
    removed prior to comparison, enabling selection of taxonomies with
    more complete taxonomic information. For example, "^[dkpcofgs]__" will
    recognize greengenes or silva rank handles. Note that rank handles are
    removed but not replaced; use the new_rank_handles parameter to
    replace the rank handles.
new_rank_handles : List[Str % Choices('disable')] | List[Str % Choices('domain', 'superkingdom', 'kingdom', 'subkingdom', 'superphylum', 'phylum', 'subphylum', 'infraphylum', 'superclass', 'class', 'subclass', 'infraclass', 'cohort', 'superorder', 'order', 'suborder', 'infraorder', 'parvorder', 'superfamily', 'family', 'subfamily', 'tribe', 'subtribe', 'genus', 'subgenus', 'species group', 'species subgroup', 'species', 'subspecies', 'forma')], optional
    Specifies the set of rank handles to prepend to taxonomic labels at
    each rank. Note that merged taxonomies will only contain as many
    levels as there are handles if this parameter is used. This will trim
    all taxonomies to the given levels, even if longer annotations exist.
    Note that this parameter will prepend rank handles whether or not they
    already exist in the taxonomy, so should ALWAYS be used in conjunction
    with `rank_handle_regex` if rank handles exist in any of the inputs.
    Use 'disable' to prevent prepending 'new_rank_handles'.
unclassified_label : Str, optional
    Specifies what label should be used for taxonomies that could not be
    resolved (when LCA modes are used).

Returns
-------
merged_data : FeatureData[Taxonomy]
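The idea behind the "lca" mode and the unclassified label can be sketched in plain Python as follows. This is an illustration under stated assumptions, not RESCRIPt's code: the function name `least_common_ancestor` is hypothetical, taxonomies are assumed to be already split into lists of levels with rank handles stripped, and the consensus is taken to be the shared leading levels.

```python
def least_common_ancestor(taxonomies, unclassified_label="Unassigned"):
    """Return the leading levels shared by every taxonomy (hypothetical helper).

    If the taxonomies disagree at the very first level, nothing can be
    resolved and the unclassified label is returned instead.
    """
    consensus = []
    # zip stops at the shortest taxonomy, so the consensus is never deeper
    # than the least specific input.
    for levels in zip(*taxonomies):
        if len(set(levels)) != 1:
            break
        consensus.append(levels[0])
    return consensus if consensus else [unclassified_label]


tax_a = ["Bacteria", "Firmicutes", "Clostridia"]
tax_b = ["Bacteria", "Firmicutes", "Bacilli"]
print(least_common_ancestor([tax_a, tax_b]))  # ['Bacteria', 'Firmicutes']
print(least_common_ancestor([["Bacteria"], ["Archaea"]]))  # ['Unassigned']
```

The "majority" and "super" modes described above relax this strict agreement rule; they are not covered by this sketch.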