
Identifying and filtering chimeric feature sequences with q2-vsearch

Chimera checking in QIIME 2 is performed on a pair of FeatureTable[Frequency] and FeatureData[Sequence] artifacts. QIIME 2 wraps the UCHIME de novo and reference pipelines from vsearch. For details on how these work, see the original UCHIME paper and the vsearch documentation.

In this tutorial, we’ll use the table and sequences from the Atacama soils tutorial.

Obtain the data

Start by creating a directory to work in.

mkdir qiime2-chimera-filtering-tutorial
cd qiime2-chimera-filtering-tutorial

Next, download the necessary files:

Use whichever download tool is available in your environment; the wget and curl commands below fetch the same two files.

Using wget:

wget \
  -O "atacama-table.qza" \
  "https://data.qiime2.org/2024.2/tutorials/chimera/atacama-table.qza"
wget \
  -O "atacama-rep-seqs.qza" \
  "https://data.qiime2.org/2024.2/tutorials/chimera/atacama-rep-seqs.qza"

Using curl:

curl -sL \
  "https://data.qiime2.org/2024.2/tutorials/chimera/atacama-table.qza" > \
  "atacama-table.qza"
curl -sL \
  "https://data.qiime2.org/2024.2/tutorials/chimera/atacama-rep-seqs.qza" > \
  "atacama-rep-seqs.qza"
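
Before moving on, it can help to confirm that both downloads succeeded, since a failed transfer may leave a missing or empty file. The following is a minimal sketch; the helper name check_inputs is hypothetical and not part of QIIME 2.

```shell
# check_inputs is a hypothetical helper, not a QIIME 2 command.
# It returns 0 only if every named file exists and is non-empty.
check_inputs() {
  missing=0
  for f in "$@"; do
    if [ ! -s "$f" ]; then
      echo "missing or empty: $f" >&2
      missing=1
    fi
  done
  return "$missing"
}

if check_inputs atacama-table.qza atacama-rep-seqs.qza; then
  echo "inputs look OK"
else
  echo "re-run the download step" >&2
fi
```

A size check like this only catches missing or empty files; it does not verify that the archive contents are intact.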

Run de novo chimera checking

qiime vsearch uchime-denovo \
  --i-table atacama-table.qza \
  --i-sequences atacama-rep-seqs.qza \
  --output-dir uchime-dn-out

Output artifacts:

  • uchime-dn-out/chimeras.qza

  • uchime-dn-out/nonchimeras.qza

  • uchime-dn-out/stats.qza

Note

Reference-based chimera checking is also available; see vsearch uchime-ref for more details.
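
As a rough sketch of the reference-based variant, the command below mirrors the de novo call and adds a reference database. The file name reference-seqs.qza is a placeholder (this tutorial does not provide a reference artifact), and the output directory name uchime-ref-out is likewise an assumption. The guard skips the call when qiime or the reference file is unavailable.

```shell
# Placeholder path: supply your own FeatureData[Sequence] reference artifact.
REFERENCE="reference-seqs.qza"

if command -v qiime >/dev/null 2>&1 && [ -s "$REFERENCE" ]; then
  # Same inputs as the de novo run, plus the reference sequences.
  qiime vsearch uchime-ref \
    --i-table atacama-table.qza \
    --i-sequences atacama-rep-seqs.qza \
    --i-reference-sequences "$REFERENCE" \
    --output-dir uchime-ref-out
else
  echo "skipping: qiime or $REFERENCE not available" >&2
fi
```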

Visualize summary stats

To learn more about the sequences that were identified as chimeric, we can tabulate the stats output from the previous step:

qiime metadata tabulate \
  --m-input-file uchime-dn-out/stats.qza \
  --o-visualization uchime-dn-out/stats.qzv

Output visualizations:

  • uchime-dn-out/stats.qzv
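
The stats can also be summarized on the command line. The sketch below exports the stats artifact and counts how many features received each call. It rests on two assumptions that should be checked against your own export: that the exported file is named stats.tsv, and that its last tab-separated column holds the vsearch call (Y for chimeric, N for non-chimeric, ? for borderline). The helper name tally_calls and the directory stats-export are hypothetical.

```shell
# tally_calls is a hypothetical helper. It counts the values in the LAST
# tab-separated column, ASSUMED to hold the call: Y, N, or ?.
tally_calls() {
  awk -F '\t' '
    $NF == "Y" || $NF == "N" || $NF == "?" { n[$NF]++ }
    END {
      printf "chimeric\t%d\nnon-chimeric\t%d\nborderline\t%d\n", n["Y"], n["N"], n["?"]
    }' "$1"
}

# Export the stats artifact (skipped when qiime or the artifact is unavailable).
if command -v qiime >/dev/null 2>&1 && [ -f uchime-dn-out/stats.qza ]; then
  qiime tools export \
    --input-path uchime-dn-out/stats.qza \
    --output-path uchime-dn-out/stats-export
fi

# Assumed file name inside the export directory.
STATS_TSV="uchime-dn-out/stats-export/stats.tsv"
if [ -f "$STATS_TSV" ]; then
  tally_calls "$STATS_TSV"
fi
```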

Filter input tables and sequences

Exclude chimeras and “borderline chimeras”

qiime feature-table filter-features \
  --i-table atacama-table.qza \
  --m-metadata-file uchime-dn-out/nonchimeras.qza \
  --o-filtered-table uchime-dn-out/table-nonchimeric-wo-borderline.qza
qiime feature-table filter-seqs \
  --i-data atacama-rep-seqs.qza \
  --m-metadata-file uchime-dn-out/nonchimeras.qza \
  --o-filtered-data uchime-dn-out/rep-seqs-nonchimeric-wo-borderline.qza
qiime feature-table summarize \
  --i-table uchime-dn-out/table-nonchimeric-wo-borderline.qza \
  --o-visualization uchime-dn-out/table-nonchimeric-wo-borderline.qzv

Output artifacts:

  • uchime-dn-out/rep-seqs-nonchimeric-wo-borderline.qza

  • uchime-dn-out/table-nonchimeric-wo-borderline.qza

Output visualizations:

  • uchime-dn-out/table-nonchimeric-wo-borderline.qzv

Exclude chimeras but retain “borderline chimeras”

qiime feature-table filter-features \
  --i-table atacama-table.qza \
  --m-metadata-file uchime-dn-out/chimeras.qza \
  --p-exclude-ids \
  --o-filtered-table uchime-dn-out/table-nonchimeric-w-borderline.qza
qiime feature-table filter-seqs \
  --i-data atacama-rep-seqs.qza \
  --m-metadata-file uchime-dn-out/chimeras.qza \
  --p-exclude-ids \
  --o-filtered-data uchime-dn-out/rep-seqs-nonchimeric-w-borderline.qza
qiime feature-table summarize \
  --i-table uchime-dn-out/table-nonchimeric-w-borderline.qza \
  --o-visualization uchime-dn-out/table-nonchimeric-w-borderline.qzv

Output artifacts:

  • uchime-dn-out/table-nonchimeric-w-borderline.qza

  • uchime-dn-out/rep-seqs-nonchimeric-w-borderline.qza

Output visualizations:

  • uchime-dn-out/table-nonchimeric-w-borderline.qzv
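
The two filtering strategies differ only in which ID list drives the filter: keeping the IDs in nonchimeras.qza drops everything else, including borderline features, while excluding the IDs in chimeras.qza drops only confirmed chimeras. The toy example below illustrates that set logic with plain text files and grep; the feature IDs are made up, and it does not call QIIME 2.

```shell
# Toy feature IDs, invented for illustration only.
workdir=$(mktemp -d)
printf '%s\n' f1 f2 f3 f4 f5 > "$workdir/all.txt"
printf '%s\n' f1 f2 f3 > "$workdir/nonchimeras.txt"  # called non-chimeric
printf '%s\n' f5 > "$workdir/chimeras.txt"           # called chimeric; f4 is borderline

# Strategy 1: keep only IDs listed as non-chimeric (drops f4 and f5).
wo_borderline=$(grep -xFf "$workdir/nonchimeras.txt" "$workdir/all.txt")

# Strategy 2: drop only IDs listed as chimeric (keeps borderline f4).
w_borderline=$(grep -vxFf "$workdir/chimeras.txt" "$workdir/all.txt")

echo "without borderline:" $wo_borderline
echo "with borderline:" $w_borderline
```

The borderline feature f4 appears in neither list, which is why it survives only the exclusion-based filter.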