Utilities in QIIME 2

QIIME 2 provides many utilities that are not part of any plugin. This document demonstrates many of them. It is organized by interface and cross-references similar functionality available in other interfaces.

q2cli

Most of the interesting utilities can be found in the tools subcommand of q2cli:

qiime tools --help

stdout:

Usage: qiime tools [OPTIONS] COMMAND [ARGS]...

  Tools for working with QIIME 2 files.

Options:
  --help      Show this message and exit.

Commands:
  cast-metadata     Designate metadata column types.
  citations         Print citations for a QIIME 2 result.
  export            Export data from a QIIME 2 Artifact or a Visualization
  extract           Extract a QIIME 2 Artifact or Visualization archive.
  import            Import data into a new QIIME 2 Artifact.
  inspect-metadata  Inspect columns available in metadata.
  peek              Take a peek at a QIIME 2 Artifact or Visualization.
  validate          Validate data in a QIIME 2 Artifact.
  view              View a QIIME 2 Visualization.

Let’s get our hands on some data so that we can learn more about this functionality! First, we will take a look at the taxonomic bar charts from the PD Mice Tutorial:

Please select a download option that is most appropriate for your environment:
wget \
  -O "taxa-barplot.qzv" \
  "https://data.qiime2.org/2021.11/tutorials/utilities/taxa-barplot.qzv"

curl -sL \
  "https://data.qiime2.org/2021.11/tutorials/utilities/taxa-barplot.qzv" > \
  "taxa-barplot.qzv"

Retrieving Citations

Now that we have some results, let’s learn more about the citations relevant to the creation of this visualization. First, we can check the help text for the qiime tools citations command:

qiime tools citations --help

stdout:

Usage: qiime tools citations [OPTIONS] ARTIFACT/VISUALIZATION

  Print citations as a BibTex file (.bib) for a QIIME 2 result.

Options:
  --help      Show this message and exit.

Now that we know how to use the command, we will run the following:

qiime tools citations taxa-barplot.qzv

stdout:

@article{framework|qiime2:2019.10.0|0,
 author = {Bolyen, Evan and Rideout, Jai Ram and Dillon, Matthew R. and Bokulich, Nicholas A. and Abnet, Christian C. and Al-Ghalith, Gabriel A. and Alexander, Harriet and Alm, Eric J. and Arumugam, Manimozhiyan and Asnicar, Francesco and Bai, Yang and Bisanz, Jordan E. and Bittinger, Kyle and Brejnrod, Asker and Brislawn, Colin J. and Brown, C. Titus and Callahan, Benjamin J. and Caraballo-Rodríguez, Andrés Mauricio and Chase, John and Cope, Emily K. and Da Silva, Ricardo and Diener, Christian and Dorrestein, Pieter C. and Douglas, Gavin M. and Durall, Daniel M. and Duvallet, Claire and Edwardson, Christian F. and Ernst, Madeleine and Estaki, Mehrbod and Fouquier, Jennifer and Gauglitz, Julia M. and Gibbons, Sean M. and Gibson, Deanna L. and Gonzalez, Antonio and Gorlick, Kestrel and Guo, Jiarong and Hillmann, Benjamin and Holmes, Susan and Holste, Hannes and Huttenhower, Curtis and Huttley, Gavin A. and Janssen, Stefan and Jarmusch, Alan K. and Jiang, Lingjing and Kaehler, Benjamin D. and Kang, Kyo Bin and Keefe, Christopher R. and Keim, Paul and Kelley, Scott T. and Knights, Dan and Koester, Irina and Kosciolek, Tomasz and Kreps, Jorden and Langille, Morgan G. I. and Lee, Joslynn and Ley, Ruth and Liu, Yong-Xin and Loftfield, Erikka and Lozupone, Catherine and Maher, Massoud and Marotz, Clarisse and Martin, Bryan D. and McDonald, Daniel and McIver, Lauren J. and Melnik, Alexey V. and Metcalf, Jessica L. and Morgan, Sydney C. and Morton, Jamie T. and Naimey, Ahmad Turan and Navas-Molina, Jose A. and Nothias, Louis Felix and Orchanian, Stephanie B. and Pearson, Talima and Peoples, Samuel L. and Petras, Daniel and Preuss, Mary Lai and Pruesse, Elmar and Rasmussen, Lasse Buur and Rivers, Adam and Robeson, Michael S. and Rosenthal, Patrick and Segata, Nicola and Shaffer, Michael and Shiffer, Arron and Sinha, Rashmi and Song, Se Jin and Spear, John R. and Swafford, Austin D. and Thompson, Luke R. and Torres, Pedro J. 
and Trinh, Pauline and Tripathi, Anupriya and Turnbaugh, Peter J. and Ul-Hasan, Sabah and van der Hooft, Justin J. J. and Vargas, Fernando and Vázquez-Baeza, Yoshiki and Vogtmann, Emily and von Hippel, Max and Walters, William and Wan, Yunhu and Wang, Mingxun and Warren, Jonathan and Weber, Kyle C. and Williamson, Charles H. D. and Willis, Amy D. and Xu, Zhenjiang Zech and Zaneveld, Jesse R. and Zhang, Yilong and Zhu, Qiyun and Knight, Rob and Caporaso, J. Gregory},
 doi = {10.1038/s41587-019-0209-9},
 issn = {1546-1696},
 journal = {Nature Biotechnology},
 number = {8},
 pages = {852-857},
 title = {Reproducible, interactive, scalable and extensible microbiome data science using QIIME 2},
 url = {https://doi.org/10.1038/s41587-019-0209-9},
 volume = {37},
 year = {2019}
}

@article{view|types:2019.10.0|BIOMV210DirFmt|0,
 author = {McDonald, Daniel and Clemente, Jose C and Kuczynski, Justin and Rideout, Jai Ram and Stombaugh, Jesse and Wendel, Doug and Wilke, Andreas and Huse, Susan and Hufnagle, John and Meyer, Folker and Knight, Rob and Caporaso, J Gregory},
 doi = {10.1186/2047-217X-1-7},
 journal = {GigaScience},
 number = {1},
 pages = {7},
 publisher = {BioMed Central},
 title = {The Biological Observation Matrix (BIOM) format or: how I learned to stop worrying and love the ome-ome},
 volume = {1},
 year = {2012}
}

@inproceedings{view|types:2019.10.0|pandas.core.frame:DataFrame|0,
 author = { Wes McKinney },
 booktitle = { Proceedings of the 9th Python in Science Conference },
 editor = { Stéfan van der Walt and Jarrod Millman },
 pages = { 51 -- 56 },
 title = { Data Structures for Statistical Computing in Python },
 year = { 2010 }
}

@inproceedings{view|types:2019.10.0|pandas.core.series:Series|0,
 author = { Wes McKinney },
 booktitle = { Proceedings of the 9th Python in Science Conference },
 editor = { Stéfan van der Walt and Jarrod Millman },
 pages = { 51 -- 56 },
 title = { Data Structures for Statistical Computing in Python },
 year = { 2010 }
}

@article{plugin|dada2:2019.10.0|0,
 author = {Callahan, Benjamin J and McMurdie, Paul J and Rosen, Michael J and Han, Andrew W and Johnson, Amy Jo A and Holmes, Susan P},
 doi = {10.1038/nmeth.3869},
 journal = {Nature methods},
 number = {7},
 pages = {581},
 publisher = {Nature Publishing Group},
 title = {DADA2: high-resolution sample inference from Illumina amplicon data},
 volume = {13},
 year = {2016}
}

@article{framework|qiime2:2019.4.0|0,
 author = {Bolyen, Evan and Rideout, Jai Ram and Dillon, Matthew R and Bokulich, Nicholas A and Abnet, Christian and Al-Ghalith, Gabriel A and Alexander, Harriet and Alm, Eric J and Arumugam, Manimozhiyan and Asnicar, Francesco and Bai, Yang and Bisanz, Jordan E and Bittinger, Kyle and Brejnrod, Asker and Brislawn, Colin J and Brown, C Titus and Callahan, Benjamin J and Caraballo-Rodríguez, Andrés Mauricio and Chase, John and Cope, Emily and Da Silva, Ricardo and Dorrestein, Pieter C and Douglas, Gavin M and Durall, Daniel M and Duvallet, Claire and Edwardson, Christian F and Ernst, Madeleine and Estaki, Mehrbod and Fouquier, Jennifer and Gauglitz, Julia M and Gibson, Deanna L and Gonzalez, Antonio and Gorlick, Kestrel and Guo, Jiarong and Hillmann, Benjamin and Holmes, Susan and Holste, Hannes and Huttenhower, Curtis and Huttley, Gavin and Janssen, Stefan and Jarmusch, Alan K and Jiang, Lingjing and Kaehler, Benjamin and Kang, Kyo Bin and Keefe, Christopher R and Keim, Paul and Kelley, Scott T and Knights, Dan and Koester, Irina and Kosciolek, Tomasz and Kreps, Jorden and Langille, Morgan GI and Lee, Joslynn and Ley, Ruth and Liu, Yong-Xin and Loftfield, Erikka and Lozupone, Catherine and Maher, Massoud and Marotz, Clarisse and Martin, Bryan and McDonald, Daniel and McIver, Lauren J and Melnik, Alexey V and Metcalf, Jessica L and Morgan, Sydney C and Morton, Jamie and Naimey, Ahmad Turan and Navas-Molina, Jose A and Nothias, Louis Felix and Orchanian, Stephanie B and Pearson, Talima and Peoples, Samuel L and Petras, Daniel and Preuss, Mary Lai and Pruesse, Elmar and Rasmussen, Lasse Buur and Rivers, Adam and Robeson, II, Michael S and Rosenthal, Patrick and Segata, Nicola and Shaffer, Michael and Shiffer, Arron and Sinha, Rashmi and Song, Se Jin and Spear, John R and Swafford, Austin D and Thompson, Luke R and Torres, Pedro J and Trinh, Pauline and Tripathi, Anupriya and Turnbaugh, Peter J and Ul-Hasan, Sabah and van der Hooft, Justin JJ and Vargas, Fernando and 
Vázquez-Baeza, Yoshiki and Vogtmann, Emily and von Hippel, Max and Walters, William and Wan, Yunhu and Wang, Mingxun and Warren, Jonathan and Weber, Kyle C and Williamson, Chase HD and Willis, Amy D and Xu, Zhenjiang Zech and Zaneveld, Jesse R and Zhang, Yilong and Knight, Rob and Caporaso, J Gregory},
 doi = {10.7287/peerj.preprints.27295v1},
 issn = {2167-9843},
 journal = {PeerJ Preprints},
 month = {oct},
 pages = {e27295v1},
 title = {QIIME 2: Reproducible, interactive, scalable, and extensible microbiome data science},
 url = {https://doi.org/10.7287/peerj.preprints.27295v1},
 volume = {6},
 year = {2018}
}

@article{action|feature-classifier:2019.10.0|method:classify_sklearn|0,
 author = {Pedregosa, Fabian and Varoquaux, Gaël and Gramfort, Alexandre and Michel, Vincent and Thirion, Bertrand and Grisel, Olivier and Blondel, Mathieu and Prettenhofer, Peter and Weiss, Ron and Dubourg, Vincent and Vanderplas, Jake and Passos, Alexandre and Cournapeau, David and Brucher, Matthieu and Perrot, Matthieu and Duchesnay, Édouard},
 journal = {Journal of machine learning research},
 number = {Oct},
 pages = {2825--2830},
 title = {Scikit-learn: Machine learning in Python},
 volume = {12},
 year = {2011}
}

@article{plugin|feature-classifier:2019.10.0|0,
 author = {Bokulich, Nicholas A. and Kaehler, Benjamin D. and Rideout, Jai Ram and Dillon, Matthew and Bolyen, Evan and Knight, Rob and Huttley, Gavin A. and Caporaso, J. Gregory},
 doi = {10.1186/s40168-018-0470-z},
 journal = {Microbiome},
 number = {1},
 pages = {90},
 title = {Optimizing taxonomic classification of marker-gene amplicon sequences with QIIME 2's q2-feature-classifier plugin},
 url = {https://doi.org/10.1186/s40168-018-0470-z},
 volume = {6},
 year = {2018}
}

@article{view|types:2019.10.0|biom.table:Table|0,
 author = {McDonald, Daniel and Clemente, Jose C and Kuczynski, Justin and Rideout, Jai Ram and Stombaugh, Jesse and Wendel, Doug and Wilke, Andreas and Huse, Susan and Hufnagle, John and Meyer, Folker and Knight, Rob and Caporaso, J Gregory},
 doi = {10.1186/2047-217X-1-7},
 journal = {GigaScience},
 number = {1},
 pages = {7},
 publisher = {BioMed Central},
 title = {The Biological Observation Matrix (BIOM) format or: how I learned to stop worrying and love the ome-ome},
 volume = {1},
 year = {2012}
}

@article{plugin|feature-classifier:2019.4.0|0,
 author = {Bokulich, Nicholas A. and Kaehler, Benjamin D. and Rideout, Jai Ram and Dillon, Matthew and Bolyen, Evan and Knight, Rob and Huttley, Gavin A. and Caporaso, J. Gregory},
 doi = {10.1186/s40168-018-0470-z},
 journal = {Microbiome},
 number = {1},
 pages = {90},
 title = {Optimizing taxonomic classification of marker-gene amplicon sequences with QIIME 2's q2-feature-classifier plugin},
 url = {https://doi.org/10.1186/s40168-018-0470-z},
 volume = {6},
 year = {2018}
}

@article{action|feature-classifier:2019.4.0|method:fit_classifier_naive_bayes|0,
 author = {Pedregosa, Fabian and Varoquaux, Gaël and Gramfort, Alexandre and Michel, Vincent and Thirion, Bertrand and Grisel, Olivier and Blondel, Mathieu and Prettenhofer, Peter and Weiss, Ron and Dubourg, Vincent and Vanderplas, Jake and Passos, Alexandre and Cournapeau, David and Brucher, Matthieu and Perrot, Matthieu and Duchesnay, Édouard},
 journal = {Journal of machine learning research},
 number = {Oct},
 pages = {2825--2830},
 title = {Scikit-learn: Machine learning in Python},
 volume = {12},
 year = {2011}
}

@inproceedings{view|types:2019.4.1|pandas.core.series:Series|0,
 author = { Wes McKinney },
 booktitle = { Proceedings of the 9th Python in Science Conference },
 editor = { Stéfan van der Walt and Jarrod Millman },
 pages = { 51 -- 56 },
 title = { Data Structures for Statistical Computing in Python },
 year = { 2010 }
}

As you can see, the citations for this particular visualization are presented above in BibTeX format.
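
Each BibTeX key in the output above appears to encode where the citation came from: a kind (framework, plugin, action, or view), then a package name and version, then an index. The following Python sketch pulls those parts out of a citations listing. The key layout is inferred from the output shown here rather than from a documented specification, and the function name citation_sources and the abbreviated sample entries are illustrative assumptions.

```python
import re


def citation_sources(bibtex_text):
    """Return (kind, package, version) for each provenance-style citation key."""
    # Capture the key between "@type{" and the first comma of each entry.
    keys = re.findall(r"^@\w+\{([^,\s]+),", bibtex_text, flags=re.MULTILINE)
    sources = []
    for key in keys:
        parts = key.split("|")
        if len(parts) < 3 or ":" not in parts[1]:
            # Skip keys that do not follow the assumed layout,
            # e.g. the generic "key0" printed by `--citations`.
            continue
        package, version = parts[1].split(":", 1)
        sources.append((parts[0], package, version))
    return sources


# Abbreviated entries (fields trimmed) using keys copied from the output above.
sample = """\
@article{framework|qiime2:2019.10.0|0,
 year = {2019}
}

@article{plugin|dada2:2019.10.0|0,
 year = {2016}
}

@article{action|feature-classifier:2019.4.0|method:fit_classifier_naive_bayes|0,
 year = {2011}
}
"""

for kind, package, version in citation_sources(sample):
    print(kind, package, version)
```

Read this way, the listing above shows that the visualization's history spans more than one QIIME 2 release (2019.4 and 2019.10), which is why some references appear more than once.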

We can also see the citations for a specific plugin:

qiime vsearch --citations

stdout:

% use `qiime tools citations` on a QIIME 2 result for complete list

@article{key0,
 author = {Rognes, Torbjørn and Flouri, Tomáš and Nichols, Ben and Quince, Christopher and Mahé, Frédéric},
 doi = {10.7717/peerj.2584},
 journal = {PeerJ},
 pages = {e2584},
 publisher = {PeerJ Inc.},
 title = {VSEARCH: a versatile open source tool for metagenomics},
 volume = {4},
 year = {2016}
}

And also for a specific action of a plugin:

qiime vsearch cluster-features-open-reference --citations

stdout:

% use `qiime tools citations` on a QIIME 2 result for complete list

@article{key0,
 author = {Rideout, Jai Ram and He, Yan and Navas-Molina, Jose A. and Walters, William A. and Ursell, Luke K. and Gibbons, Sean M. and Chase, John and McDonald, Daniel and Gonzalez, Antonio and Robbins-Pianka, Adam and Clemente, Jose C. and Gilbert, Jack A. and Huse, Susan M. and Zhou, Hong-Wei and Knight, Rob and Caporaso, J. Gregory},
 doi = {10.7717/peerj.545},
 journal = {PeerJ},
 pages = {e545},
 publisher = {PeerJ Inc.},
 title = {Subsampled open-reference clustering creates consistent, comprehensive OTU definitions and scales to billions of sequences},
 volume = {2},
 year = {2014}
}

Viewing Visualizations

What if we want to view our taxa bar plots? One option is to load the visualization at https://view.qiime2.org. All QIIME 2 Results may be opened this way. This will present the visualization (assuming the file is a .qzv), Result details (e.g. filename, uuid, type, format, citations), and a provenance graph showing how the Visualization or Artifact was created.

Note

Provenance viewing is only available at https://view.qiime2.org.

Another option is to use qiime tools view to accomplish the job. This command may only be used with Visualizations, and will not display Visualization details (see Peeking at Results) or provenance, but provides a quick and easy way to view your results from the command line.

qiime tools view taxa-barplot.qzv

This will open a browser window with your visualization loaded in it. When you are done, you can close the browser window and press ctrl-c on the keyboard to terminate the command.

Peeking at Results

Oftentimes we need to verify the type and uuid of an Artifact. We can use the qiime tools peek command to view a brief summary report of those facts. First, let’s get some data to look at:

Please select a download option that is most appropriate for your environment:
wget \
  -O "faith-pd-vector.qza" \
  "https://data.qiime2.org/2021.11/tutorials/utilities/faith-pd-vector.qza"

curl -sL \
  "https://data.qiime2.org/2021.11/tutorials/utilities/faith-pd-vector.qza" > \
  "faith-pd-vector.qza"

Now that we have data, we can learn more about the file:

qiime tools peek faith-pd-vector.qza

stdout:

UUID:        d5186dce-438d-44bb-903c-cb51a7ad4abe
Type:        SampleData[AlphaDiversity] % Properties('phylogenetic')
Data format: AlphaDiversityDirectoryFormat

Here we can see that the type of the Artifact is SampleData[AlphaDiversity] % Properties('phylogenetic'), as well as the Artifact’s UUID and format.
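
For intuition about where these three facts live: a .qza or .qzv file is a zip archive whose single top-level directory is named after the UUID and contains a metadata.yaml file. The Python sketch below reads that file using only the standard library. It is a simplified illustration, not how qiime tools peek is implemented: the function name peek_archive is made up, the in-memory archive is a stand-in built for the example, and the line-based parsing assumes simple "key: value" lines instead of full YAML.

```python
import io
import zipfile


def peek_archive(archive):
    """Read the uuid, type, and format recorded in an archive's metadata.yaml."""
    with zipfile.ZipFile(archive) as zf:
        # Only the top-level "<uuid>/metadata.yaml" describes the result itself;
        # deeper copies belong to provenance.
        names = [n for n in zf.namelist()
                 if n.count("/") == 1 and n.endswith("/metadata.yaml")]
        if not names:
            raise ValueError("no top-level metadata.yaml found")
        text = zf.read(names[0]).decode("utf-8")

    info = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if not sep:
            continue
        value = value.strip()
        # Drop one pair of matching surrounding quotes, if present.
        if len(value) >= 2 and value[0] == value[-1] and value[0] in "'\"":
            value = value[1:-1]
        info[key.strip()] = value
    return info


# Stand-in archive built in memory, using the values reported by peek above.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as zf:
    zf.writestr(
        "d5186dce-438d-44bb-903c-cb51a7ad4abe/metadata.yaml",
        "uuid: d5186dce-438d-44bb-903c-cb51a7ad4abe\n"
        "type: SampleData[AlphaDiversity] % Properties('phylogenetic')\n"
        "format: AlphaDiversityDirectoryFormat\n",
    )
buffer.seek(0)

info = peek_archive(buffer)
print(info["uuid"])
print(info["type"])
print(info["format"])
```

Passing a path such as "faith-pd-vector.qza" instead of the in-memory buffer should work the same way, provided the file's metadata.yaml uses the simple layout assumed here.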

Validating Results

We can also validate the integrity of the file by running qiime tools validate:

qiime tools validate faith-pd-vector.qza

stdout:

Result faith-pd-vector.qza appears to be valid at level=max.

If there is a problem with the file, this command will usually report what it is (within reason).

Inspecting Metadata

In the Metadata tutorial we learned about the metadata tabulate command, and the resulting visualization it creates. Oftentimes we don’t care so much about the values of the Metadata, but rather, just the shape of it: how many columns? What are their names? What are their types? How many rows (or IDs) are in the file?

We can demonstrate this by first downloading some sample metadata:

Please select a download option that is most appropriate for your environment:
wget \
  -O "sample-metadata.tsv" \
  "https://data.qiime2.org/2021.11/tutorials/pd-mice/sample_metadata.tsv"

curl -sL \
  "https://data.qiime2.org/2021.11/tutorials/pd-mice/sample_metadata.tsv" > \
  "sample-metadata.tsv"

Then, we can run the qiime tools inspect-metadata command:

qiime tools inspect-metadata sample-metadata.tsv

stdout:

              COLUMN NAME  TYPE       
=========================  ===========
                  barcode  categorical
                 mouse_id  categorical
                 genotype  categorical
                  cage_id  categorical
                    donor  categorical
             donor_status  categorical
     days_post_transplant  numeric    
genotype_and_donor_status  categorical
=========================  ===========
                     IDS:  48
                 COLUMNS:  8
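
As a rough illustration of what this report summarizes, the following Python sketch tallies column types and IDs from a small TSV. It is not how q2cli implements inspect-metadata: the helper names (is_number, inspect_metadata) and the three-column excerpt are made up for illustration, and the type rule is a simplification. A column is treated as numeric when every non-empty value parses as a float, unless a #q2:types row declares a type for it.

```python
import csv
import io


def is_number(value):
    """Return True if the string parses as a float (a simplified numeric test)."""
    try:
        float(value)
        return True
    except ValueError:
        return False


def inspect_metadata(tsv_text):
    """Return ({column: type}, number_of_ids) for tab-separated metadata."""
    rows = [r for r in csv.reader(io.StringIO(tsv_text), delimiter="\t") if r]
    header, body = rows[0], rows[1:]

    # An optional "#q2:types" row declares column types; empty cells are inferred.
    declared = {}
    if body and body[0][0] == "#q2:types":
        declared = {name: t for name, t in zip(header[1:], body[0][1:]) if t}
        body = body[1:]

    types = {}
    for index, name in enumerate(header[1:], start=1):
        values = [row[index] for row in body if row[index] != ""]
        inferred = "numeric" if all(is_number(v) for v in values) else "categorical"
        types[name] = declared.get(name, inferred)
    return types, len(body)


# Hypothetical three-column excerpt; the real file has eight columns and 48 IDs.
sample = (
    "sample_name\tbarcode\tmouse_id\tdays_post_transplant\n"
    "#q2:types\t\tcategorical\t\n"
    "recip.220.WT.OB1.D7\tCCTCCGTCATGG\t457\t49\n"
    "recip.389.WT.HC2.D21\tATGTATCAATTA\t435\t21\n"
)

types, n_ids = inspect_metadata(sample)
for name, column_type in types.items():
    print(name, column_type)
print("IDS:", n_ids)
print("COLUMNS:", len(types))
```

Note that mouse_id holds only digits yet is reported as categorical, because the types row in this excerpt declares it so; without that declaration the sketch would infer numeric.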

Question

How many metadata columns are there in sample-metadata.tsv? How many IDs? Identify how many categorical columns are present. Now do the same for numeric columns.

This tool can be very helpful for learning about Metadata column names for files that are viewable as Metadata.

Please select a download option that is most appropriate for your environment:
wget \
  -O "jaccard-pcoa.qza" \
  "https://data.qiime2.org/2021.11/tutorials/utilities/jaccard-pcoa.qza"

curl -sL \
  "https://data.qiime2.org/2021.11/tutorials/utilities/jaccard-pcoa.qza" > \
  "jaccard-pcoa.qza"

The file we just downloaded is a Jaccard PCoA (from the PD Mice Tutorial), which can be used in place of the “typical” TSV-formatted Metadata file. We might need to know the column names for commands we wish to run; using inspect-metadata, we can learn all about them:

qiime tools inspect-metadata jaccard-pcoa.qza

stdout:

COLUMN NAME  TYPE   
===========  =======
     Axis 1  numeric
     Axis 2  numeric
     Axis 3  numeric
     Axis 4  numeric
     Axis 5  numeric
     Axis 6  numeric
     Axis 7  numeric
     Axis 8  numeric
     Axis 9  numeric
    Axis 10  numeric
    Axis 11  numeric
    Axis 12  numeric
    Axis 13  numeric
    Axis 14  numeric
    Axis 15  numeric
    Axis 16  numeric
    Axis 17  numeric
    Axis 18  numeric
    Axis 19  numeric
    Axis 20  numeric
    Axis 21  numeric
    Axis 22  numeric
    Axis 23  numeric
    Axis 24  numeric
    Axis 25  numeric
    Axis 26  numeric
    Axis 27  numeric
    Axis 28  numeric
    Axis 29  numeric
    Axis 30  numeric
    Axis 31  numeric
    Axis 32  numeric
    Axis 33  numeric
    Axis 34  numeric
    Axis 35  numeric
    Axis 36  numeric
    Axis 37  numeric
    Axis 38  numeric
    Axis 39  numeric
    Axis 40  numeric
    Axis 41  numeric
    Axis 42  numeric
    Axis 43  numeric
    Axis 44  numeric
    Axis 45  numeric
    Axis 46  numeric
    Axis 47  numeric
===========  =======
       IDS:  47
   COLUMNS:  47

Question

How many IDs are there? How many columns? Are there any categorical columns? Why?

Casting Metadata Column Types

In the Metadata tutorial we learned about column types and using the qiime tools cast-metadata tool to specify column types within a provided metadata file. Below we will go through a few scenarios of how this tool can be used, and some common mistakes that may come up.

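We’ll start by downloading some sample metadata. Note: this is the same sample metadata used in the Inspecting Metadata section, where it was saved as sample-metadata.tsv (with a hyphen). The commands below expect the file name sample_metadata.tsv (with an underscore), so either download the file again under that name or copy the earlier file to it.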

Please select a download option that is most appropriate for your environment:
wget \
  -O "sample_metadata.tsv" \
  "https://data.qiime2.org/2021.11/tutorials/pd-mice/sample_metadata.tsv"

curl -sL \
  "https://data.qiime2.org/2021.11/tutorials/pd-mice/sample_metadata.tsv" > \
  "sample_metadata.tsv"

In this example, we will cast the days_post_transplant column from numeric to categorical, and the mouse_id column from categorical to numeric. The rest of the columns contained within our metadata will be left as-is.

qiime tools cast-metadata sample_metadata.tsv \
  --cast days_post_transplant:categorical \
  --cast mouse_id:numeric

stdout:

sample_name	barcode	mouse_id	genotype	cage_id	donor	donor_status	days_post_transplant	genotype_and_donor_status
#q2:types	categorical	numeric	categorical	categorical	categorical	categorical	categorical	categorical
recip.220.WT.OB1.D7	CCTCCGTCATGG	457	wild type	C35	hc_1	Healthy	49	wild type and Healthy
recip.290.ASO.OB2.D1	AACAGTAAACAA	456	susceptible	C35	hc_1	Healthy	49	susceptible and Healthy
recip.389.WT.HC2.D21	ATGTATCAATTA	435	susceptible	C31	hc_1	Healthy	21	susceptible and Healthy
recip.391.ASO.PD2.D14	GTCAGTATGGCT	435	susceptible	C31	hc_1	Healthy	14	susceptible and Healthy
recip.391.ASO.PD2.D21	AGACAGTAGGAG	437	susceptible	C31	hc_1	Healthy	21	susceptible and Healthy
recip.391.ASO.PD2.D7	GGTCTTAGCACC	435	susceptible	C31	hc_1	Healthy	7	susceptible and Healthy
recip.400.ASO.HC2.D14	CGTTCGCTAGCC	437	susceptible	C31	hc_1	Healthy	14	susceptible and Healthy
recip.401.ASO.HC2.D7	ATTTACAATTGA	437	susceptible	C31	hc_1	Healthy	7	susceptible and Healthy
recip.403.ASO.PD2.D21	CGCAGATTAGTA	456	susceptible	C35	hc_1	Healthy	21	susceptible and Healthy
recip.411.ASO.HC2.D14	ATGTTAGGGAAT	456	susceptible	C35	hc_1	Healthy	14	susceptible and Healthy
recip.411.ASO.HC2.D21	CTCATATGCTAT	457	wild type	C35	hc_1	Healthy	21	wild type and Healthy
recip.411.ASO.HC2.D49	GCAACGAACGAG	435	susceptible	C31	hc_1	Healthy	49	susceptible and Healthy
recip.412.ASO.HC2.D14	AAGTGGCTATCC	457	wild type	C35	hc_1	Healthy	14	wild type and Healthy
recip.412.ASO.HC2.D7	GCATTCGGCGTT	456	susceptible	C35	hc_1	Healthy	7	susceptible and Healthy
recip.413.WT.HC2.D7	ACCAGTGACTCA	457	wild type	C35	hc_1	Healthy	7	wild type and Healthy
recip.456.ASO.HC3.D49	ACGGCGTTATGT	468	wild type	C42	hc_1	Healthy	49	wild type and Healthy
recip.458.ASO.HC3.D21	ACGGCCCTGGAG	468	wild type	C42	hc_1	Healthy	21	wild type and Healthy
recip.458.ASO.HC3.D49	CATTTGACGACG	469	wild type	C42	hc_1	Healthy	49	wild type and Healthy
recip.459.WT.HC3.D14	ACATGGGCGGAA	468	wild type	C42	hc_1	Healthy	14	wild type and Healthy
recip.459.WT.HC3.D21	CATAAATTCTTG	469	wild type	C42	hc_1	Healthy	21	wild type and Healthy
recip.459.WT.HC3.D49	GCTGCGTATACC	536	susceptible	C43	pd_1	PD	49	susceptible and PD
recip.460.WT.HC3.D14	CTGCGGATATAC	469	wild type	C42	hc_1	Healthy	14	wild type and Healthy
recip.460.WT.HC3.D21	GTCAATTAGTGG	536	susceptible	C43	pd_1	PD	21	susceptible and PD
recip.460.WT.HC3.D49	GAGAAGCTTATA	537	wild type	C43	pd_1	PD	49	wild type and PD
recip.460.WT.HC3.D7	GACCCGTTTCGC	468	wild type	C42	hc_1	Healthy	7	wild type and Healthy
recip.461.ASO.HC3.D21	AGCCCGCAAAGG	537	wild type	C43	pd_1	PD	21	wild type and PD
recip.461.ASO.HC3.D49	GGCGTAACGGCA	538	wild type	C44	pd_1	PD	49	wild type and PD
recip.461.ASO.HC3.D7	ATTGCCTTGATT	469	wild type	C42	hc_1	Healthy	7	wild type and Healthy
recip.462.WT.PD3.D14	GTGAGGGCAAGT	536	susceptible	C43	pd_1	PD	14	susceptible and PD
recip.462.WT.PD3.D21	GGCCTATAAGTC	538	wild type	C44	pd_1	PD	21	wild type and PD
recip.462.WT.PD3.D49	AATACAGACCTG	539	susceptible	C44	pd_1	PD	49	susceptible and PD
recip.462.WT.PD3.D7	TTAGGATTCTAT	536	susceptible	C43	pd_1	PD	7	susceptible and PD
recip.463.WT.PD3.D14	ATATTGGCAGCC	537	wild type	C43	pd_1	PD	14	wild type and PD
recip.463.WT.PD3.D21	CGCGGCGCAGCT	539	susceptible	C44	pd_1	PD	21	susceptible and PD
recip.463.WT.PD3.D7	GTTTATCTTAAG	537	wild type	C43	pd_1	PD	7	wild type and PD
recip.464.WT.PD3.D14	TCATCCGTCGGC	538	wild type	C44	pd_1	PD	14	wild type and PD
recip.465.ASO.PD3.D14	GGCTTCGGAGCG	539	susceptible	C44	pd_1	PD	14	susceptible and PD
recip.465.ASO.PD3.D7	CAGTCTAGTACG	538	wild type	C44	pd_1	PD	7	wild type and PD
recip.466.ASO.PD3.D7	GTGGGACTGCGC	539	susceptible	C44	pd_1	PD	7	susceptible and PD
recip.467.WT.HC3.D49.a	GTCAGGTGCGGC	437	susceptible	C31	hc_1	Healthy	49	susceptible and Healthy
recip.467.WT.HC3.D49.b	GTTAACTTACTA	546	susceptible	C49	pd_1	PD	49	susceptible and PD
recip.536.ASO.PD4.D49	CAAATTCGGGAT	547	wild type	C49	pd_1	PD	49	wild type and PD
recip.537.WT.PD4.D21	CTCTATTCCACC	546	susceptible	C49	pd_1	PD	21	susceptible and PD
recip.538.WT.PD4.D21	ATGGATAGCTAA	547	wild type	C49	pd_1	PD	21	wild type and PD
recip.539.ASO.PD4.D14	GATCCGGCAGGA	546	susceptible	C49	pd_1	PD	14	susceptible and PD
recip.539.ASO.PD4.D7	GTTCGAGTGAAT	546	susceptible	C49	pd_1	PD	7	susceptible and PD
recip.540.ASO.HC4.D14	CTTCCAACTCAT	547	wild type	C49	pd_1	PD	14	wild type and PD
recip.540.ASO.HC4.D7	CGGCCTAAGTTC	547	wild type	C49	pd_1	PD	7	wild type and PD

If the --output-file parameter is provided, the specified output file will contain the modified column types that we cast above, along with the rest of the columns and associated data contained in sample_metadata.tsv.

If you do not wish to save your cast metadata to an output file, you can omit the --output-file parameter and the results will be written to stdout (as shown in the example above).
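
The effect of a cast on the file contents can be sketched in a few lines of Python: the data rows are left untouched and only the #q2:types row changes. This is an illustrative approximation, not the q2cli implementation. The function name cast_metadata and the two-column excerpt are assumptions, and unlike the real tool (which wrote a type for every column in the output above) this sketch records only types that were already declared or explicitly cast.

```python
import csv
import io


def cast_metadata(tsv_text, casts):
    """Return TSV text whose #q2:types row reflects the requested casts."""
    rows = [r for r in csv.reader(io.StringIO(tsv_text), delimiter="\t") if r]
    header, body = rows[0], rows[1:]

    # Start from an existing "#q2:types" row if there is one, else empty cells.
    types = [""] * (len(header) - 1)
    if body and body[0][0] == "#q2:types":
        types = list(body[0][1:])
        body = body[1:]

    for column, new_type in casts.items():
        # Search from position 1 so the ID column itself can never be cast.
        index = header.index(column, 1)  # raises ValueError if the column is absent
        if new_type == "numeric":
            for row in body:
                if row[index] != "":
                    float(row[index])  # raises ValueError for non-numeric values
        types[index - 1] = new_type

    out = io.StringIO()
    writer = csv.writer(out, delimiter="\t", lineterminator="\n")
    writer.writerow(header)
    writer.writerow(["#q2:types"] + types)
    writer.writerows(body)
    return out.getvalue()


# Hypothetical two-column excerpt of sample_metadata.tsv.
sample = (
    "sample_name\tmouse_id\tdays_post_transplant\n"
    "recip.220.WT.OB1.D7\t457\t49\n"
    "recip.389.WT.HC2.D21\t435\t21\n"
)

result = cast_metadata(
    sample,
    {"days_post_transplant": "categorical", "mouse_id": "numeric"},
)
print(result, end="")
```

Checking values before accepting a numeric cast reflects one of the common mistakes: a column such as barcode cannot sensibly be cast to numeric, and in this sketch that surfaces as a ValueError.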

The --ignore-extra and --error-on-missing flags are used to handle cast columns not contained within the original metadata file, and columns contained within the metadata file that aren’t included in the cast call, respectively. We can take a look at how these flags can be used below:

In the first example, we’ll take a look at utilizing the --ignore-extra flag when a column is cast that is not included within the original metadata file. Let’s start by looking at what will happen if an extra column is included and this flag is not enabled.

qiime tools cast-metadata sample_metadata.tsv \
  --cast spleen:numeric

stderr:

Usage: qiime tools cast-metadata [OPTIONS] METADATA...

Error: Invalid value for cast: The following cast columns were not found within the metadata: spleen

Notice that the spleen column included in the cast call results in a raised error. If we want to ignore any extra columns that are not present in the original metadata file, we can enable the --ignore-extra flag.

qiime tools cast-metadata sample_metadata.tsv \
  --cast spleen:numeric \
  --ignore-extra

When this flag is enabled, all columns included in the cast that are not present in the original metadata file will be ignored. Note that stdout for this example has been omitted since we will not see a raised error with this flag enabled.

In our second example, we’ll take a look at the --error-on-missing flag, which handles columns that are present within the metadata that are not included in the cast call.

The default behavior permits a subset of the full metadata file to be included in the cast call (i.e. not all columns within the metadata must be present in the cast call). If the --error-on-missing flag is enabled, all metadata columns must be included in the cast call, otherwise an error will be raised.

qiime tools cast-metadata sample_metadata.tsv \
  --cast mouse_id:numeric \
  --error-on-missing

stderr:

Usage: qiime tools cast-metadata [OPTIONS] METADATA...

Error: Invalid value for cast: The following columns within the metadata were not provided in the cast: donor, genotype_and_donor_status, donor_status, cage_id, barcode, genotype, days_post_transplant
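
The two flags amount to two set comparisons between the columns named in the cast call and the columns present in the metadata. The Python sketch below models that logic under stated assumptions: the function name check_cast_columns and its keyword arguments are hypothetical stand-ins for the CLI flags, and the error text is borrowed from the messages shown above. It illustrates the behavior described in this section and is not taken from the q2cli source.

```python
def check_cast_columns(metadata_columns, cast_columns,
                       ignore_extra=False, error_on_missing=False):
    """Validate cast columns against metadata columns; return the usable casts."""
    # Cast columns that do not exist in the metadata.
    extra = [c for c in cast_columns if c not in metadata_columns]
    # Metadata columns that the cast call did not mention.
    missing = [c for c in metadata_columns if c not in cast_columns]

    if extra and not ignore_extra:
        raise ValueError(
            "The following cast columns were not found within the metadata: "
            + ", ".join(extra)
        )
    if missing and error_on_missing:
        raise ValueError(
            "The following columns within the metadata were not provided "
            "in the cast: " + ", ".join(missing)
        )
    # Extra columns, if tolerated, are simply dropped.
    return [c for c in cast_columns if c in metadata_columns]


columns = [
    "barcode", "mouse_id", "genotype", "cage_id", "donor",
    "donor_status", "days_post_transplant", "genotype_and_donor_status",
]

# Default behavior: a subset of columns is fine.
print(check_cast_columns(columns, ["mouse_id"]))

# An unknown column is dropped only when ignore_extra is set.
print(check_cast_columns(columns, ["spleen"], ignore_extra=True))

# With error_on_missing, every metadata column must be cast.
try:
    check_cast_columns(columns, ["mouse_id"], error_on_missing=True)
except ValueError as error:
    print(error)
```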

Artifact API

Coming soon, please stay tuned!