Hematopoietic cell transplantation data

This tutorial reuses data from Liao et al. (2021), "Compilation of longitudinal microbiota data and hospitalome from hematopoietic cell transplantation patients" (Liao et al. [LTC+21]).

We thank the study participants for the contribution of their valuable samples while undergoing cancer treatment, and we thank Liao et al. [LTC+21] for their considerable efforts to make these data accessible to the cancer microbiome research community.

Any work that uses these data should cite Liao et al. [LTC+21] and the original studies, all of which are cited in Liao et al. [LTC+21]. Our analyses will primarily focus on the samples collected for "Reconstitution of the gut microbiota of antibiotic-treated patients by autologous fecal microbiota transplant" (Taur et al. [TCS+18]).

Structure of the tutorial

This tutorial is split into two parts:

  1. The upstream tutorial covers the steps up to the generation of the feature table, which tallies the frequency of amplicon sequence variants (ASVs) on a per-sample basis, and the feature data, which lists the sequence that defines each ASV in the feature table.

  2. The downstream tutorial begins with a feature table and feature data and constitutes the analysis and interpretation of that information. We’ll spend the majority of the week on the downstream tutorial.
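To make the two outputs of the upstream tutorial concrete, here is a minimal sketch in plain Python of what a feature table and feature data represent. The sample IDs, ASV IDs, and sequences are made up for illustration, and the sketch simply counts identical sequences; it does not perform the denoising that real ASV inference involves, and it does not use QIIME 2.

```python
from collections import Counter

# Toy, made-up observations: one (sample ID, sequence) pair per read.
reads = [
    ("sample-1", "ACGT"),
    ("sample-1", "ACGT"),
    ("sample-1", "TTGA"),
    ("sample-2", "TTGA"),
]

# Feature data: maps each ASV identifier to the sequence that defines it.
# The "asv-N" identifiers are hypothetical labels chosen for this example.
sequences = sorted({seq for _, seq in reads})
feature_data = {f"asv-{i}": seq for i, seq in enumerate(sequences, start=1)}
seq_to_id = {seq: asv_id for asv_id, seq in feature_data.items()}

# Feature table: for each sample, how many times each ASV was observed.
feature_table = {}
for sample_id, seq in reads:
    feature_table.setdefault(sample_id, Counter())[seq_to_id[seq]] += 1

for sample_id, counts in feature_table.items():
    print(sample_id, dict(counts))
```

The feature table answers "how often was each ASV seen in each sample?", while the feature data answers "which sequence does each ASV ID stand for?". The downstream tutorial starts from these two pieces of information.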

The two parts of this tutorial are both derived from the same data set (Liao et al. [LTC+21]). The upstream tutorial uses a relatively small number of samples (n=41) and is designed to let us work through the most computationally expensive steps of the analysis quickly, so you can get experience running these steps. By working with fewer samples, these steps can be run in just a few minutes.

The downstream tutorial uses the complete feature table and feature data published on FigShare by Liao et al. [LTC+21]. Since that data set contains many more samples (n=12,546) and over 550,000,000 sequences, it would be very time-consuming to run the upstream steps on these data interactively. We will show how to load the data in QIIME 2 and filter the full data set to focus our work on specific samples of interest. In our case, we’ll work with the Taur et al. [TCS+18] samples. However, the full data set will be available for you to filter in other ways and to experiment with on your own. As the authors note: "These microbiota data, combined with the curated clinical metadata presented here, can serve as powerful hypothesis generators for microbiome studies."
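The idea of filtering a large feature table down to the samples of interest can be sketched in plain Python. This is only an illustration of the concept under stated assumptions: the data are made up, the `study` metadata column and its values are hypothetical, and the tutorial itself performs this step with QIIME 2 rather than with the helper functions shown here.

```python
# Toy feature table: sample ID -> {ASV ID: count}. All values are made up.
feature_table = {
    "sample-1": {"asv-1": 12},
    "sample-2": {"asv-2": 7},
    "sample-3": {"asv-1": 5, "asv-3": 9},
}

# Toy sample metadata. The "study" column and its values are hypothetical.
sample_metadata = {
    "sample-1": {"study": "study-A"},
    "sample-2": {"study": "study-B"},
    "sample-3": {"study": "study-A"},
}


def filter_samples(table, metadata, column, value):
    """Return a new table with only the samples whose metadata matches."""
    keep = {sid for sid, row in metadata.items() if row.get(column) == value}
    return {sid: counts for sid, counts in table.items() if sid in keep}


def observed_features(table):
    """List the ASV IDs that have a nonzero count in at least one sample."""
    return sorted(
        {asv for counts in table.values() for asv, n in counts.items() if n > 0}
    )


filtered = filter_samples(feature_table, sample_metadata, "study", "study-A")
remaining_asvs = observed_features(filtered)

print(sorted(filtered))
print(remaining_asvs)
```

Note that filtering samples can leave some ASVs with no remaining observations (`asv-2` in this toy example), so the set of features worth keeping may shrink along with the set of samples.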