Docstring:
Usage: qiime quality-control evaluate-composition [OPTIONS]
This visualizer compares the feature composition of paired observed and
expected samples that share the same sample ID in two separate feature
tables. Typically, feature composition consists of taxonomy
classifications or other semicolon-delimited feature annotations. Taxon
accuracy rate, taxon detection rate, and linear regression scores between
expected and observed values are calculated at each semicolon-delimited
rank, and per-level accuracy and observation correlations are plotted. A
histogram of the distance between each false positive observation and the
nearest expected feature is also generated, where distance equals the
number of rank differences between the observed feature and the nearest
common lineage in the expected feature. This visualizer is most suitable for
testing per-run data quality on sequencing runs that contain mock
communities or other samples with known composition. It is also suitable for
sanity checks of bioinformatics pipeline performance.
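The per-rank scoring described above can be illustrated with a minimal sketch. It assumes features are plain semicolon-delimited taxonomy strings and applies the TAR and TDR definitions given under --p-plot-tar and --p-plot-tdr. The helper names `collapse` and `tar_tdr` and the sample taxa are hypothetical, and the plugin's own implementation may differ.

```python
def collapse(taxon: str, depth: int) -> str:
    """Truncate a semicolon-delimited annotation to its first `depth` ranks."""
    ranks = [r.strip() for r in taxon.split(";")]
    return ";".join(ranks[:depth])


def tar_tdr(expected, observed, depth):
    """Return (TAR, TDR) for two collections of taxonomy strings at `depth`."""
    exp = {collapse(t, depth) for t in expected}
    obs = {collapse(t, depth) for t in observed}
    tp = len(exp & obs)  # true positives: both expected and observed
    fp = len(obs - exp)  # false positives: observed but not expected
    fn = len(exp - obs)  # false negatives: expected but not observed
    tar = tp / (tp + fp) if obs else 0.0
    tdr = tp / (tp + fn) if exp else 0.0
    return tar, tdr


# Hypothetical mock-community taxa, for illustration only.
expected = [
    "k__Bacteria;p__Firmicutes;c__Bacilli",
    "k__Bacteria;p__Proteobacteria;c__Gammaproteobacteria",
]
observed = [
    "k__Bacteria;p__Firmicutes;c__Bacilli",
    "k__Bacteria;p__Firmicutes;c__Clostridia",
    "k__Bacteria;p__Bacteroidetes;c__Bacteroidia",
]

for depth in (1, 2, 3):
    print(depth, tar_tdr(expected, observed, depth))
```

Scores typically fall as depth increases, because a classification that is correct at a shallow rank can still diverge at a deeper one.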
Inputs:
--i-expected-features ARTIFACT FeatureTable[RelativeFrequency]
Expected feature compositions [required]
--i-observed-features ARTIFACT FeatureTable[RelativeFrequency]
Observed feature compositions [required]
Parameters:
--p-depth INTEGER Maximum depth of semicolon-delimited taxonomic ranks
to test (e.g., 1 = root, 7 = species for the
Greengenes reference sequence database). [default: 7]
--p-palette TEXT Choices('Set1', 'Set2', 'Set3', 'Pastel1', 'Pastel2',
'Paired', 'Accent', 'Dark2', 'tab10', 'tab20', 'tab20b', 'tab20c',
'viridis', 'plasma', 'inferno', 'magma', 'terrain', 'rainbow')
Color palette to use for plotting.
[default: 'Set1']
--p-plot-tar / --p-no-plot-tar
Plot taxon accuracy rate (TAR) on score plot. TAR is
the number of true positive features divided by the
total number of observed features (TAR = true
positives / (true positives + false positives)).
[default: True]
--p-plot-tdr / --p-no-plot-tdr
Plot taxon detection rate (TDR) on score plot. TDR
is the number of true positive features divided by
the total number of expected features (TDR = true
positives / (true positives + false negatives)).
[default: True]
--p-plot-r-value / --p-no-plot-r-value
Plot expected vs. observed linear regression r value
on score plot. [default: False]
--p-plot-r-squared / --p-no-plot-r-squared
Plot expected vs. observed linear regression
r-squared value on score plot. [default: True]
--p-plot-bray-curtis / --p-no-plot-bray-curtis
Plot expected vs. observed Bray-Curtis dissimilarity
scores on score plot. [default: False]
--p-plot-jaccard / --p-no-plot-jaccard
Plot expected vs. observed Jaccard distance scores
on score plot. [default: False]
--p-plot-observed-features / --p-no-plot-observed-features
Plot observed features count on score plot.
[default: False]
--p-plot-observed-features-ratio / --p-no-plot-observed-features-ratio
Plot ratio of observed:expected features on score
plot. [default: True]
--m-metadata-file METADATA
--m-metadata-column COLUMN MetadataColumn[Categorical]
Optional sample metadata that maps observed-features
sample IDs to expected-features sample IDs.
[optional]
Outputs:
--o-visualization VISUALIZATION
[required]
Miscellaneous:
--output-dir PATH Output unspecified results to a directory
--verbose / --quiet Display verbose output to stdout and/or stderr
during execution of this action, or silence output if
execution is successful (silence is golden).
--example-data PATH Write example data and exit.
--citations Show citations and exit.
--use-cache DIRECTORY Specify the cache to be used for the intermediate
work of this action. If not provided, the default
cache under $TMP/qiime2/ will be used.
IMPORTANT FOR HPC USERS: If you are on an HPC system
and are using parallel execution, it is important to
set this to a location that is globally accessible to
all nodes in the cluster.
--help Show this message and exit.
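As a rough illustration of the optional score-plot metrics (r value, r-squared, Bray-Curtis dissimilarity, Jaccard distance), the sketch below computes each one for a single expected/observed sample pair using only the standard library. The helper names and abundances are hypothetical, and the plugin itself may compute these scores differently, for example through a scientific library.

```python
import math


def align(expected: dict, observed: dict):
    """Return two abundance lists over the union of feature IDs (missing = 0)."""
    features = sorted(set(expected) | set(observed))
    x = [expected.get(f, 0.0) for f in features]
    y = [observed.get(f, 0.0) for f in features]
    return x, y


def pearson_r(x, y):
    """Pearson correlation, which equals the r value of a simple linear regression."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    if sx == 0 or sy == 0:
        return float("nan")  # undefined when either vector is constant
    return cov / (sx * sy)


def bray_curtis(x, y):
    """Bray-Curtis dissimilarity: sum|x - y| / sum(x + y)."""
    return sum(abs(a - b) for a, b in zip(x, y)) / sum(a + b for a, b in zip(x, y))


def jaccard(x, y):
    """Jaccard distance on presence/absence of each feature."""
    both = sum(1 for a, b in zip(x, y) if a > 0 and b > 0)
    either = sum(1 for a, b in zip(x, y) if a > 0 or b > 0)
    return 1 - both / either


# Hypothetical relative frequencies for one sample.
expected = {"A": 0.5, "B": 0.5}
observed = {"A": 0.5, "B": 0.25, "C": 0.25}

x, y = align(expected, observed)
r = pearson_r(x, y)
print("r:", r, "r-squared:", r ** 2)
print("Bray-Curtis:", bray_curtis(x, y))
print("Jaccard:", jaccard(x, y))
```

Filling absent features with zero keeps both vectors the same length, so a false positive such as "C" lowers every score rather than being silently ignored.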
Import:
from qiime2.plugins.quality_control.visualizers import evaluate_composition
Docstring:
Evaluate expected vs. observed taxonomic composition of samples
This visualizer compares the feature composition of paired observed and
expected samples that share the same sample ID in two separate feature
tables. Typically, feature composition consists of taxonomy
classifications or other semicolon-delimited feature annotations. Taxon
accuracy rate, taxon detection rate, and linear regression scores between
expected and observed values are calculated at each semicolon-delimited
rank, and per-level accuracy and observation correlations are plotted. A
histogram of the distance between each false positive observation and the
nearest expected feature is also generated, where distance equals the
number of rank differences between the observed feature and the nearest
common lineage in the expected feature. This visualizer is most suitable
for testing per-run data quality on sequencing runs that contain mock
communities or other samples with known composition. It is also suitable
for sanity checks of bioinformatics pipeline performance.
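One possible reading of the false-positive distance described above is sketched here: for each false positive, find the longest lineage prefix it shares with any expected feature, then count the remaining ranks. The helper names and taxa are hypothetical, and this is an interpretation of the description rather than the plugin's confirmed algorithm.

```python
from collections import Counter


def split_ranks(taxon):
    """Split a semicolon-delimited annotation into a list of rank labels."""
    return [r.strip() for r in taxon.split(";")]


def shared_prefix_len(a, b):
    """Number of leading ranks that two lineages have in common."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n


def mismatch_distance(observed_taxon, expected_taxa):
    """Ranks separating a false positive from its nearest expected lineage."""
    obs = split_ranks(observed_taxon)
    best = max(
        (shared_prefix_len(obs, split_ranks(e)) for e in expected_taxa),
        default=0,
    )
    return len(obs) - best


# Hypothetical taxa, for illustration only.
expected = ["k__Bacteria;p__Firmicutes;c__Bacilli;o__Lactobacillales"]
false_positives = [
    "k__Bacteria;p__Firmicutes;c__Bacilli;o__Bacillales",
    "k__Bacteria;p__Firmicutes;c__Clostridia;o__Clostridiales",
    "k__Archaea;p__Euryarchaeota;c__Methanobacteria;o__Methanobacteriales",
]

# Counts of false positives at each distance, i.e. the histogram data.
histogram = Counter(mismatch_distance(fp, expected) for fp in false_positives)
print(sorted(histogram.items()))
```

Under this reading, a small distance indicates a near miss (for example, the wrong order within the correct class), while a large distance suggests contamination or a gross misclassification.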
Parameters
----------
expected_features : FeatureTable[RelativeFrequency]
Expected feature compositions
observed_features : FeatureTable[RelativeFrequency]
Observed feature compositions
depth : Int, optional
Maximum depth of semicolon-delimited taxonomic ranks to test (e.g., 1 =
root, 7 = species for the Greengenes reference sequence database).
palette : Str % Choices('Set1', 'Set2', 'Set3', 'Pastel1', 'Pastel2', 'Paired', 'Accent', 'Dark2', 'tab10', 'tab20', 'tab20b', 'tab20c', 'viridis', 'plasma', 'inferno', 'magma', 'terrain', 'rainbow'), optional
Color palette to use for plotting.
plot_tar : Bool, optional
Plot taxon accuracy rate (TAR) on score plot. TAR is the number of true
positive features divided by the total number of observed features (TAR
= true positives / (true positives + false positives)).
plot_tdr : Bool, optional
Plot taxon detection rate (TDR) on score plot. TDR is the number of
true positive features divided by the total number of expected features
(TDR = true positives / (true positives + false negatives)).
plot_r_value : Bool, optional
Plot expected vs. observed linear regression r value on score plot.
plot_r_squared : Bool, optional
Plot expected vs. observed linear regression r-squared value on score
plot.
plot_bray_curtis : Bool, optional
Plot expected vs. observed Bray-Curtis dissimilarity scores on score
plot.
plot_jaccard : Bool, optional
Plot expected vs. observed Jaccard distance scores on score plot.
plot_observed_features : Bool, optional
Plot observed features count on score plot.
plot_observed_features_ratio : Bool, optional
Plot ratio of observed:expected features on score plot.
metadata : MetadataColumn[Categorical], optional
Optional sample metadata that maps observed_features sample IDs to
expected_features sample IDs.
Returns
-------
visualization : Visualization
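A minimal sketch of calling this visualizer through the Artifact API follows. The wrapper function `run_evaluate_composition` and the file paths in the usage note are hypothetical. The QIIME 2 imports sit inside the function so the definition can be loaded without QIIME 2 installed, although calling it requires a working QIIME 2 environment.

```python
def run_evaluate_composition(expected_path, observed_path, output_path, depth=7):
    """Load two FeatureTable[RelativeFrequency] artifacts and save the visualization."""
    # Imported lazily so this module can be imported without QIIME 2 installed.
    from qiime2 import Artifact
    from qiime2.plugins.quality_control.visualizers import evaluate_composition

    expected = Artifact.load(expected_path)
    observed = Artifact.load(observed_path)

    # Parameter names follow the docstring above.
    results = evaluate_composition(
        expected_features=expected,
        observed_features=observed,
        depth=depth,
        plot_tar=True,
        plot_tdr=True,
        plot_r_squared=True,
    )

    # The action returns a results object whose `visualization` field
    # holds the Visualization listed under Returns.
    results.visualization.save(output_path)
    return output_path
```

A call might look like `run_evaluate_composition("expected.qza", "observed.qza", "composition.qzv")`, where all three file names are placeholders.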