Warning
This site has been replaced by the new QIIME 2 “amplicon distribution” documentation, as of the 2025.4 release of QIIME 2. You can still access the content from the “old docs” here for the QIIME 2 2024.10 and earlier releases, but we recommend that you transition to the new documentation at https://amplicon-docs.qiime2.org. Content on this site is no longer updated and may be out of date.
evaluate-composition: Evaluate expected vs. observed taxonomic composition of samples
Docstring:
Usage: qiime quality-control evaluate-composition [OPTIONS]

  This visualizer compares the feature composition of pairs of observed and
  expected samples containing the same sample ID in two separate feature
  tables. Typically, feature composition will consist of taxonomy
  classifications or other semicolon-delimited feature annotations. Taxon
  accuracy rate, taxon detection rate, and linear regression scores between
  expected and observed observations are calculated at each
  semicolon-delimited rank, and plots of per-level accuracy and observation
  correlations are plotted. A histogram of distance between false positive
  observations and the nearest expected feature is also generated, where
  distance equals the number of rank differences between the observed feature
  and the nearest common lineage in the expected feature.

  This visualizer is most suitable for testing per-run data quality on
  sequencing runs that contain mock communities or other samples with known
  composition. Also suitable for sanity checks of bioinformatics pipeline
  performance.

Inputs:
  --i-expected-features ARTIFACT FeatureTable[RelativeFrequency]
      Expected feature compositions [required]
  --i-observed-features ARTIFACT FeatureTable[RelativeFrequency]
      Observed feature compositions [required]

Parameters:
  --p-depth INTEGER
      Maximum depth of semicolon-delimited taxonomic ranks to test (e.g.,
      1 = root, 7 = species for the greengenes reference sequence database).
      [default: 7]
  --p-palette TEXT Choices('Set1', 'Set2', 'Set3', 'Pastel1', 'Pastel2',
      'Paired', 'Accent', 'Dark2', 'tab10', 'tab20', 'tab20b', 'tab20c',
      'viridis', 'plasma', 'inferno', 'magma', 'terrain', 'rainbow')
      Color palette to utilize for plotting. [default: 'Set1']
  --p-plot-tar / --p-no-plot-tar
      Plot taxon accuracy rate (TAR) on score plot. TAR is the number of true
      positive features divided by the total number of observed features
      (TAR = true positives / (true positives + false positives)).
      [default: True]
  --p-plot-tdr / --p-no-plot-tdr
      Plot taxon detection rate (TDR) on score plot. TDR is the number of true
      positive features divided by the total number of expected features
      (TDR = true positives / (true positives + false negatives)).
      [default: True]
  --p-plot-r-value / --p-no-plot-r-value
      Plot expected vs. observed linear regression r value on score plot.
      [default: False]
  --p-plot-r-squared / --p-no-plot-r-squared
      Plot expected vs. observed linear regression r-squared value on score
      plot. [default: True]
  --p-plot-bray-curtis / --p-no-plot-bray-curtis
      Plot expected vs. observed Bray-Curtis dissimilarity scores on score
      plot. [default: False]
  --p-plot-jaccard / --p-no-plot-jaccard
      Plot expected vs. observed Jaccard distances scores on score plot.
      [default: False]
  --p-plot-observed-features / --p-no-plot-observed-features
      Plot observed features count on score plot. [default: False]
  --p-plot-observed-features-ratio / --p-no-plot-observed-features-ratio
      Plot ratio of observed:expected features on score plot.
      [default: True]
  --m-metadata-file METADATA
  --m-metadata-column COLUMN MetadataColumn[Categorical]
      Optional sample metadata that maps observed-features sample IDs to
      expected-features sample IDs. [optional]

Outputs:
  --o-visualization VISUALIZATION
      [required]

Miscellaneous:
  --output-dir PATH
      Output unspecified results to a directory
  --verbose / --quiet
      Display verbose output to stdout and/or stderr during execution of this
      action. Or silence output if execution is successful (silence is
      golden).
  --example-data PATH
      Write example data and exit.
  --citations
      Show citations and exit.
  --use-cache DIRECTORY
      Specify the cache to be used for the intermediate work of this action.
      If not provided, the default cache under $TMP/qiime2/ will be used.
      IMPORTANT FOR HPC USERS: If you are on an HPC system and are using
      parallel execution it is important to set this to a location that is
      globally accessible to all nodes in the cluster.
  --help
      Show this message and exit.
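The TAR and TDR definitions in the help text can be illustrated with a small standalone calculation. The following is a minimal sketch, not the plugin's implementation: it assumes each sample is reduced to the set of annotations present in it, truncated to the requested depth, and the helper names `collapse` and `tar_tdr` are hypothetical.

```python
from typing import Iterable, Tuple


def collapse(taxon: str, depth: int, sep: str = ";") -> str:
    """Truncate a semicolon-delimited annotation to its first `depth` ranks."""
    ranks = [rank.strip() for rank in taxon.split(sep)]
    return sep.join(ranks[:depth])


def tar_tdr(expected: Iterable[str], observed: Iterable[str],
            depth: int) -> Tuple[float, float]:
    """Return (TAR, TDR) at one rank depth.

    TAR = TP / (TP + FP) and TDR = TP / (TP + FN), following the
    definitions in the help text. Presence/absence only (assumption).
    """
    exp = {collapse(t, depth) for t in expected}
    obs = {collapse(t, depth) for t in observed}
    tp = len(exp & obs)   # expected and observed
    fp = len(obs - exp)   # observed but not expected
    fn = len(exp - obs)   # expected but not observed
    tar = tp / (tp + fp) if (tp + fp) else 0.0
    tdr = tp / (tp + fn) if (tp + fn) else 0.0
    return tar, tdr


# Illustrative (made-up) mock community and classifier output.
expected = [
    "k__Bacteria;p__Firmicutes;c__Bacilli",
    "k__Bacteria;p__Proteobacteria;c__Gammaproteobacteria",
]
observed = [
    "k__Bacteria;p__Firmicutes;c__Bacilli",
    "k__Bacteria;p__Firmicutes;c__Clostridia",
    "k__Bacteria;p__Bacteroidetes;c__Bacteroidia",
]

for depth in (1, 2, 3):
    tar, tdr = tar_tdr(expected, observed, depth)
    print(f"depth={depth} TAR={tar:.2f} TDR={tdr:.2f}")
```

In this toy example both scores fall as the depth increases, because lineages that agree at a shallow rank diverge at deeper ranks. This is the per-level trend the score plot is meant to show.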
Import:
from qiime2.plugins.quality_control.visualizers import evaluate_composition
Docstring:
Evaluate expected vs. observed taxonomic composition of samples

This visualizer compares the feature composition of pairs of observed and
expected samples containing the same sample ID in two separate feature
tables. Typically, feature composition will consist of taxonomy
classifications or other semicolon-delimited feature annotations. Taxon
accuracy rate, taxon detection rate, and linear regression scores between
expected and observed observations are calculated at each
semicolon-delimited rank, and plots of per-level accuracy and observation
correlations are plotted. A histogram of distance between false positive
observations and the nearest expected feature is also generated, where
distance equals the number of rank differences between the observed feature
and the nearest common lineage in the expected feature.

This visualizer is most suitable for testing per-run data quality on
sequencing runs that contain mock communities or other samples with known
composition. Also suitable for sanity checks of bioinformatics pipeline
performance.

Parameters
----------
expected_features : FeatureTable[RelativeFrequency]
    Expected feature compositions
observed_features : FeatureTable[RelativeFrequency]
    Observed feature compositions
depth : Int, optional
    Maximum depth of semicolon-delimited taxonomic ranks to test (e.g.,
    1 = root, 7 = species for the greengenes reference sequence database).
palette : Str % Choices('Set1', 'Set2', 'Set3', 'Pastel1', 'Pastel2', 'Paired', 'Accent', 'Dark2', 'tab10', 'tab20', 'tab20b', 'tab20c', 'viridis', 'plasma', 'inferno', 'magma', 'terrain', 'rainbow'), optional
    Color palette to utilize for plotting.
plot_tar : Bool, optional
    Plot taxon accuracy rate (TAR) on score plot. TAR is the number of true
    positive features divided by the total number of observed features
    (TAR = true positives / (true positives + false positives)).
plot_tdr : Bool, optional
    Plot taxon detection rate (TDR) on score plot. TDR is the number of true
    positive features divided by the total number of expected features
    (TDR = true positives / (true positives + false negatives)).
plot_r_value : Bool, optional
    Plot expected vs. observed linear regression r value on score plot.
plot_r_squared : Bool, optional
    Plot expected vs. observed linear regression r-squared value on score
    plot.
plot_bray_curtis : Bool, optional
    Plot expected vs. observed Bray-Curtis dissimilarity scores on score
    plot.
plot_jaccard : Bool, optional
    Plot expected vs. observed Jaccard distances scores on score plot.
plot_observed_features : Bool, optional
    Plot observed features count on score plot.
plot_observed_features_ratio : Bool, optional
    Plot ratio of observed:expected features on score plot.
metadata : MetadataColumn[Categorical], optional
    Optional sample metadata that maps observed_features sample IDs to
    expected_features sample IDs.

Returns
-------
visualization : Visualization
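The false positive distance histogram described in the docstring can be pictured with a short standalone sketch. This is one plausible reading of "number of rank differences between the observed feature and the nearest common lineage", not the plugin's own code: it assumes the distance is the number of ranks in the observed annotation minus the longest prefix it shares with any expected annotation. The function names are hypothetical.

```python
from collections import Counter
from typing import Iterable, List


def split_ranks(taxon: str, sep: str = ";") -> List[str]:
    """Split a semicolon-delimited annotation into stripped rank labels."""
    return [rank.strip() for rank in taxon.split(sep)]


def shared_prefix_len(a: List[str], b: List[str]) -> int:
    """Count the leading ranks on which two lineages agree."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n


def fp_distance(observed_taxon: str, expected_taxa: Iterable[str]) -> int:
    """Ranks separating an observed feature from its nearest expected lineage.

    Assumed definition: len(observed ranks) minus the longest shared prefix.
    """
    obs = split_ranks(observed_taxon)
    best = max(
        (shared_prefix_len(obs, split_ranks(e)) for e in expected_taxa),
        default=0,
    )
    return len(obs) - best


def fp_distance_histogram(expected: Iterable[str],
                          observed: Iterable[str]) -> Counter:
    """Tally distances for observed features that are not expected."""
    expected = list(expected)
    expected_set = set(expected)
    return Counter(
        fp_distance(o, expected) for o in observed if o not in expected_set
    )


# Illustrative (made-up) data.
expected = [
    "k__Bacteria;p__Firmicutes;c__Bacilli",
    "k__Bacteria;p__Proteobacteria;c__Gammaproteobacteria",
]
observed = [
    "k__Bacteria;p__Firmicutes;c__Bacilli",         # true positive
    "k__Bacteria;p__Firmicutes;c__Clostridia",      # wrong at class level
    "k__Bacteria;p__Bacteroidetes;c__Bacteroidia",  # wrong at phylum level
]

print(dict(fp_distance_histogram(expected, observed)))
```

Under this reading, a small distance means the misclassification shares most of its lineage with an expected taxon, while a large distance points to an unrelated feature such as a contaminant.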