Alpha diversity visualizations#
Alpha group significance#
First we’ll look for general patterns, by comparing different categorical groupings of samples to see if there is some relationship to richness.
To start with, we’ll examine ‘observed features’:
alpha_group_sig_obs_feats_viz, = diversity_actions.alpha_group_significance(
alpha_diversity=observed_features_vector,
metadata=sample_metadata_md,
)
action_results <- diversity_actions$alpha_group_significance(
alpha_diversity=observed_features_vector,
metadata=sample_metadata_md,
)
alpha_group_sig_obs_feats_viz <- action_results$visualization
qiime diversity alpha-group-significance \
--i-alpha-diversity diversity-core-metrics-phylogenetic/observed_features_vector.qza \
--m-metadata-file sample-metadata.tsv \
--o-visualization alpha-group-sig-obs-feats.qzv
use.action(
use.UsageAction('diversity', 'alpha_group_significance'),
use.UsageInputs(
alpha_diversity=core_metrics_results.observed_features_vector,
metadata=sample_metadata),
use.UsageOutputNames(visualization='alpha_group_sig_obs_feats')
)
- Using the
qiime2 diversity alpha-group-significance
tool: Set “alpha_diversity” to
#: qiime2 diversity core-metrics-phylogenetic [...] : observed_features_vector.qza
For “metadata”:
Perform the following steps.
Leave as
Metadata from TSV
Set “Metadata Source” to
sample-metadata.tsv
Press the
Execute
button.
- Once completed, for the new entry in your history, use the
Edit
button to set the name as follows: (Renaming is optional, but it will make any subsequent steps easier to complete.)
History Name
“Name” to set (be sure to press
Save
)#: qiime2 diversity alpha-group-significance [...] : visualization.qzv
alpha-group-sig-obs-feats.qzv
The first thing to notice is the high variability in each individual’s richness
(PatientID
). The centers and spreads of the individual distributions are
likely to obscure other effects, so we will want to keep this in mind.
Additionally, we have repeated measures of each individual, so we are
violating independence assumptions when looking at other categories.
(Kruskal-Wallis is a non-parameteric test, but like most tests, still requires
samples to be independent.)
Keeping in mind that other categories are probably inconclusive, we notice that there are (amusingly, and somewhat reassuringly) differences in stool consistency (solid vs non-solid).
Because these data were derived from a study in which participants recieved auto-fecal microbiota transplant, we may also be interested in whether there was a difference in richness between the control group and the auto-FMT goup.
Looking at autoFmtGroup
we see that there is no apparent difference, but
we also know that we are violating independence with our repeated measures, and
all patients recieved a bone-marrow transplant which may be a stronger effect.
(The goal of the auto-FMT was to mitigate the impact of the marrow transplant.)
We will use a more advanced statistical model to explore this question.
Linear Mixed Effects#
In order to manage the repeated measures, we will use a linear
mixed-effects model. In a mixed-effects model, we combine fixed-effects (your
typical linear regression coefficients) with random-effects. These random
effects are some (ostensibly random) per-group coefficient which minimizes the
error within that group. In our situation, we would want our random effect to
be the PatientID
as we can see each subject has a different baseline for
richness (and we have multiple measures for each patient).
By making that a random effect, we can more accurately ascribe associations to
the fixed effects as we treat each sample as a “draw” from a per-group
distribution.
There are several ways to create a linear model with random effects, but we will be using a random-intercept, which allows for the per-subject intercept to take on a different average from the population intercept (modeling what we saw in the group-significance plot above).
First let’s evaluate the general trend of the transplant.
import qiime2.plugins.longitudinal.actions as longitudinal_actions
md_observed_features_md = observed_features_vector.view(Metadata)
merged_observed_features_md = sample_metadata_md.merge(md_observed_features_md)
lme_obs_features_HCT_viz, = longitudinal_actions.linear_mixed_effects(
metadata=merged_observed_features_md,
state_column='DayRelativeToNearestHCT',
individual_id_column='PatientID',
metric='observed_features',
)
longitudinal_actions <- import("qiime2.plugins.longitudinal.actions")
md_observed_features_md <- observed_features_vector$view(Metadata)
merged_observed_features_md <- sample_metadata_md$merge(md_observed_features_md)
action_results <- longitudinal_actions$linear_mixed_effects(
metadata=merged_observed_features_md,
state_column='DayRelativeToNearestHCT',
individual_id_column='PatientID',
metric='observed_features',
)
lme_obs_features_HCT_viz <- action_results$visualization
qiime longitudinal linear-mixed-effects \
--m-metadata-file sample-metadata.tsv diversity-core-metrics-phylogenetic/observed_features_vector.qza \
--p-state-column DayRelativeToNearestHCT \
--p-individual-id-column PatientID \
--p-metric observed_features \
--o-visualization lme-obs-features-HCT.qzv
md_observed_features = use.view_as_metadata(
'md_observed_features',
core_metrics_results.observed_features_vector)
merged_observed_features = use.merge_metadata(
'merged_observed_features',
sample_metadata,
md_observed_features)
use.action(
use.UsageAction('longitudinal', 'linear_mixed_effects'),
use.UsageInputs(metadata=merged_observed_features,
state_column='DayRelativeToNearestHCT',
individual_id_column='PatientID',
metric='observed_features'),
use.UsageOutputNames(visualization='lme-obs-features-HCT')
)
- Using the
qiime2 longitudinal linear-mixed-effects
tool: For “metadata”:
Perform the following steps.
Leave as
Metadata from TSV
Set “Metadata Source” to
sample-metadata.tsv
Press the
+ Insert metadata
button to set up the next steps.Change to
Metadata from Artifact
Set “Metadata Source” to
qiime2 diversity core-metrics-phylogenetic [...] : observed_features_vector.qza
Set “state_column” to
DayRelativeToNearestHCT
Set “individual_id_column” to
PatientID
Expand the
additional options
sectionSet “metric” to
observed_features
Press the
Execute
button.
- Once completed, for the new entry in your history, use the
Edit
button to set the name as follows: (Renaming is optional, but it will make any subsequent steps easier to complete.)
History Name
“Name” to set (be sure to press
Save
)#: qiime2 longitudinal linear-mixed-effects [...] : visualization.qzv
lme-obs-features-HCT.qzv
Here we see a significant association between richness and the bone marrow transplant.
We may also be interested in the effect of the auto fecal microbiota transplant. It should be known that these are generally correlated, so choosing one model over the other will require external knowledge.
lme_obs_features_FMT_viz, = longitudinal_actions.linear_mixed_effects(
metadata=merged_observed_features_md,
state_column='day-relative-to-fmt',
individual_id_column='PatientID',
metric='observed_features',
)
action_results <- longitudinal_actions$linear_mixed_effects(
metadata=merged_observed_features_md,
state_column='day-relative-to-fmt',
individual_id_column='PatientID',
metric='observed_features',
)
lme_obs_features_FMT_viz <- action_results$visualization
qiime longitudinal linear-mixed-effects \
--m-metadata-file sample-metadata.tsv diversity-core-metrics-phylogenetic/observed_features_vector.qza \
--p-state-column day-relative-to-fmt \
--p-individual-id-column PatientID \
--p-metric observed_features \
--o-visualization lme-obs-features-FMT.qzv
use.action(
use.UsageAction('longitudinal', 'linear_mixed_effects'),
use.UsageInputs(metadata=merged_observed_features,
state_column='day-relative-to-fmt',
individual_id_column='PatientID',
metric='observed_features'),
use.UsageOutputNames(visualization='lme-obs-features-FMT')
)
- Using the
qiime2 longitudinal linear-mixed-effects
tool: For “metadata”:
Perform the following steps.
Leave as
Metadata from TSV
Set “Metadata Source” to
sample-metadata.tsv
Press the
+ Insert metadata
button to set up the next steps.Change to
Metadata from Artifact
Set “Metadata Source” to
qiime2 diversity core-metrics-phylogenetic [...] : observed_features_vector.qza
Set “state_column” to
day-relative-to-fmt
Set “individual_id_column” to
PatientID
Expand the
additional options
sectionSet “metric” to
observed_features
Press the
Execute
button.
- Once completed, for the new entry in your history, use the
Edit
button to set the name as follows: (Renaming is optional, but it will make any subsequent steps easier to complete.)
History Name
“Name” to set (be sure to press
Save
)#: qiime2 longitudinal linear-mixed-effects [...] : visualization.qzv
lme-obs-features-FMT.qzv
We also see a downward trend from the FMT. Since the goal of the FMT was to ameliorate the impact of the bone marrow transplant protocol (which involves an extreme course of antibiotics) on gut health, and the timing of the FMT is related to the timing of the marrow transplant, we might deduce that the negative coefficient is primarily related to the bone marrow transplant procedure. (We can’t prove this with statistics alone however, in this case, we are using abductive reasoning).
Looking at the log-likelihood, we also note that the HCT result is slightly better than the FMT in accounting for the loss of richness. But only slightly, if we were to perform model testing it may not prove significant.
In any case, we can ask a more targeted question to identify if the FMT was useful in recovering richness.
By adding the autoFmtGroup
to our linear model, we can see if there
are different slopes for the two groups, based on an interaction term.
lme_obs_features_treatmentVScontrol_viz, = longitudinal_actions.linear_mixed_effects(
metadata=merged_observed_features_md,
state_column='day-relative-to-fmt',
group_columns='autoFmtGroup',
individual_id_column='PatientID',
metric='observed_features',
)
action_results <- longitudinal_actions$linear_mixed_effects(
metadata=merged_observed_features_md,
state_column='day-relative-to-fmt',
group_columns='autoFmtGroup',
individual_id_column='PatientID',
metric='observed_features',
)
lme_obs_features_treatmentVScontrol_viz <- action_results$visualization
qiime longitudinal linear-mixed-effects \
--m-metadata-file sample-metadata.tsv diversity-core-metrics-phylogenetic/observed_features_vector.qza \
--p-state-column day-relative-to-fmt \
--p-group-columns autoFmtGroup \
--p-individual-id-column PatientID \
--p-metric observed_features \
--o-visualization lme-obs-features-treatmentVScontrol.qzv
use.action(
use.UsageAction('longitudinal', 'linear_mixed_effects'),
use.UsageInputs(metadata=merged_observed_features,
state_column='day-relative-to-fmt',
group_columns='autoFmtGroup',
individual_id_column='PatientID',
metric='observed_features'),
use.UsageOutputNames(visualization='lme-obs-features-treatmentVScontrol')
)
- Using the
qiime2 longitudinal linear-mixed-effects
tool: For “metadata”:
Perform the following steps.
Leave as
Metadata from TSV
Set “Metadata Source” to
sample-metadata.tsv
Press the
+ Insert metadata
button to set up the next steps.Change to
Metadata from Artifact
Set “Metadata Source” to
qiime2 diversity core-metrics-phylogenetic [...] : observed_features_vector.qza
Set “state_column” to
day-relative-to-fmt
Set “individual_id_column” to
PatientID
Expand the
additional options
sectionSet “metric” to
observed_features
Set “group_columns” to
autoFmtGroup
Press the
Execute
button.
- Once completed, for the new entry in your history, use the
Edit
button to set the name as follows: (Renaming is optional, but it will make any subsequent steps easier to complete.)
History Name
“Name” to set (be sure to press
Save
)#: qiime2 longitudinal linear-mixed-effects [...] : visualization.qzv
lme-obs-features-treatmentVScontrol.qzv
Here we see that the autoFmtGroup
is not on its own a significant predictor
of richness, but its interaction term with Q('day-relative-to-fmt')
is.
This implies that there are different slopes between these groups, and we note
that given the coding of Q('day-relative-to-fmt'):autoFmtGroup[T.treatment]
we have a positive coefficient which counteracts (to a degree) the negative
coefficient of Q('day-relative-to-fmt')
.
Warning
The slopes of the regression scatterplot in this visualization are not the same fit as our mixed-effects model in the table. They are a naive OLS fit to give a sense of the situation. A proper visualization would be a partial regression plot which can condition on other terms to show some effect in “isolation”. This is not currently implemented in QIIME 2.
How would you test the above models with a different diversity index, such as Faith’s Phylogenetic Diversity?
Solution to
Group Significance:
alpha_group_sig_faith_pd_viz, = diversity_actions.alpha_group_significance(
alpha_diversity=faith_pd_vector,
metadata=sample_metadata_md,
)
action_results <- diversity_actions$alpha_group_significance(
alpha_diversity=faith_pd_vector,
metadata=sample_metadata_md,
)
alpha_group_sig_faith_pd_viz <- action_results$visualization
qiime diversity alpha-group-significance \
--i-alpha-diversity diversity-core-metrics-phylogenetic/faith_pd_vector.qza \
--m-metadata-file sample-metadata.tsv \
--o-visualization alpha-group-sig-faith-pd.qzv
use.action(
use.UsageAction('diversity', 'alpha_group_significance'),
use.UsageInputs(
alpha_diversity=core_metrics_results.faith_pd_vector,
metadata=sample_metadata),
use.UsageOutputNames(visualization='alpha_group_sig_faith_pd')
)
- Using the
qiime2 diversity alpha-group-significance
tool: Set “alpha_diversity” to
#: qiime2 diversity core-metrics-phylogenetic [...] : faith_pd_vector.qza
For “metadata”:
Perform the following steps.
Leave as
Metadata from TSV
Set “Metadata Source” to
sample-metadata.tsv
Press the
Execute
button.
- Once completed, for the new entry in your history, use the
Edit
button to set the name as follows: (Renaming is optional, but it will make any subsequent steps easier to complete.)
History Name
“Name” to set (be sure to press
Save
)#: qiime2 diversity alpha-group-significance [...] : visualization.qzv
alpha-group-sig-faith-pd.qzv
Linear-Mixed-Effects:
md_faith_pd_md = faith_pd_vector.view(Metadata)
merged_faith_pd_md = sample_metadata_md.merge(md_faith_pd_md)
lme_faith_pd_treatmentVScontrol_viz, = longitudinal_actions.linear_mixed_effects(
metadata=merged_faith_pd_md,
state_column='day-relative-to-fmt',
group_columns='autoFmtGroup',
individual_id_column='PatientID',
metric='faith_pd',
)
md_faith_pd_md <- faith_pd_vector$view(Metadata)
merged_faith_pd_md <- sample_metadata_md$merge(md_faith_pd_md)
action_results <- longitudinal_actions$linear_mixed_effects(
metadata=merged_faith_pd_md,
state_column='day-relative-to-fmt',
group_columns='autoFmtGroup',
individual_id_column='PatientID',
metric='faith_pd',
)
lme_faith_pd_treatmentVScontrol_viz <- action_results$visualization
qiime longitudinal linear-mixed-effects \
--m-metadata-file sample-metadata.tsv diversity-core-metrics-phylogenetic/faith_pd_vector.qza \
--p-state-column day-relative-to-fmt \
--p-group-columns autoFmtGroup \
--p-individual-id-column PatientID \
--p-metric faith_pd \
--o-visualization lme-faith-pd-treatmentVScontrol.qzv
md_faith_pd = use.view_as_metadata(
'md_faith_pd',
core_metrics_results.faith_pd_vector)
merged_faith_pd = use.merge_metadata(
'merged_faith_pd',
sample_metadata,
md_faith_pd)
use.action(
use.UsageAction('longitudinal', 'linear_mixed_effects'),
use.UsageInputs(metadata=merged_faith_pd,
state_column='day-relative-to-fmt',
group_columns='autoFmtGroup',
individual_id_column='PatientID',
metric='faith_pd'),
use.UsageOutputNames(visualization='lme_faith_pd_treatmentVScontrol')
)
- Using the
qiime2 longitudinal linear-mixed-effects
tool: For “metadata”:
Perform the following steps.
Leave as
Metadata from TSV
Set “Metadata Source” to
sample-metadata.tsv
Press the
+ Insert metadata
button to set up the next steps.Change to
Metadata from Artifact
Set “Metadata Source” to
qiime2 diversity core-metrics-phylogenetic [...] : faith_pd_vector.qza
Set “state_column” to
day-relative-to-fmt
Set “individual_id_column” to
PatientID
Expand the
additional options
sectionSet “metric” to
faith_pd
Set “group_columns” to
autoFmtGroup
Press the
Execute
button.
- Once completed, for the new entry in your history, use the
Edit
button to set the name as follows: (Renaming is optional, but it will make any subsequent steps easier to complete.)
History Name
“Name” to set (be sure to press
Save
)#: qiime2 longitudinal linear-mixed-effects [...] : visualization.qzv
lme-faith-pd-treatmentVScontrol.qzv