Identifying and filtering chimeric feature sequences with q2-vsearch
Chimera checking in QIIME 2 is performed on a pair of FeatureTable[Frequency] and FeatureData[Sequence] artifacts. QIIME 2 wraps the UCHIME de novo and reference pipelines from vsearch. For details on how these work, see the original UCHIME paper and the vsearch documentation.
In this tutorial, we’ll use the table and sequences from the Atacama soils tutorial.
Obtain the data
Start by creating a directory to work in.
mkdir qiime2-chimera-filtering-tutorial
cd qiime2-chimera-filtering-tutorial
Next, download the necessary files:
Download URL: https://data.qiime2.org/2024.10/tutorials/chimera/atacama-table.qza
Save as: atacama-table.qza
Using wget:
wget \
-O "atacama-table.qza" \
"https://data.qiime2.org/2024.10/tutorials/chimera/atacama-table.qza"
Or, equivalently, using curl:
curl -sL \
"https://data.qiime2.org/2024.10/tutorials/chimera/atacama-table.qza" > \
"atacama-table.qza"
Download URL: https://data.qiime2.org/2024.10/tutorials/chimera/atacama-rep-seqs.qza
Save as: atacama-rep-seqs.qza
Using wget:
wget \
-O "atacama-rep-seqs.qza" \
"https://data.qiime2.org/2024.10/tutorials/chimera/atacama-rep-seqs.qza"
Or, equivalently, using curl:
curl -sL \
"https://data.qiime2.org/2024.10/tutorials/chimera/atacama-rep-seqs.qza" > \
"atacama-rep-seqs.qza"
Run de novo chimera checking
qiime vsearch uchime-denovo \
--i-table atacama-table.qza \
--i-sequences atacama-rep-seqs.qza \
--output-dir uchime-dn-out
Output artifacts:
uchime-dn-out/chimeras.qza
uchime-dn-out/nonchimeras.qza
uchime-dn-out/stats.qza
Note
Reference-based chimera checking is also available; see vsearch uchime-ref for more details.
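For orientation, a reference-based run might look like the sketch below. This is an illustration under assumptions, not a step of this tutorial: reference-seqs.qza stands in for a FeatureData[Sequence] reference artifact that you supply yourself, and uchime-ref-out is an arbitrary output directory name. The wrapper function run_uchime_ref is also made up for this example. It prints the command as a dry run when qiime or the reference file is not available.

```shell
# Hypothetical wrapper around `qiime vsearch uchime-ref`.
# $1 = reference artifact (placeholder name), $2 = output directory.
run_uchime_ref() {
  ref_artifact="$1"
  out_dir="$2"
  # Build the full command line as this function's positional parameters.
  set -- qiime vsearch uchime-ref \
    --i-table atacama-table.qza \
    --i-sequences atacama-rep-seqs.qza \
    --i-reference-sequences "$ref_artifact" \
    --output-dir "$out_dir"
  if command -v qiime >/dev/null 2>&1 && [ -f "$ref_artifact" ]; then
    "$@"                 # run the command for real
  else
    echo "dry run: $*"   # otherwise just show what would be run
  fi
}

run_uchime_ref reference-seqs.qza uchime-ref-out
```

Consult the vsearch uchime-ref documentation for the parameters that control detection sensitivity before relying on the defaults.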
Visualize summary stats
To learn more about the sequences that were identified as chimeric, we can tabulate the stats output from the previous step:
qiime metadata tabulate \
--m-input-file uchime-dn-out/stats.qza \
--o-visualization uchime-dn-out/stats.qzv
Filter input tables and sequences
Exclude chimeras and “borderline chimeras”
qiime feature-table filter-features \
--i-table atacama-table.qza \
--m-metadata-file uchime-dn-out/nonchimeras.qza \
--o-filtered-table uchime-dn-out/table-nonchimeric-wo-borderline.qza
qiime feature-table filter-seqs \
--i-data atacama-rep-seqs.qza \
--m-metadata-file uchime-dn-out/nonchimeras.qza \
--o-filtered-data uchime-dn-out/rep-seqs-nonchimeric-wo-borderline.qza
qiime feature-table summarize \
--i-table uchime-dn-out/table-nonchimeric-wo-borderline.qza \
--o-visualization uchime-dn-out/table-nonchimeric-wo-borderline.qzv
Exclude chimeras but retain “borderline chimeras”
qiime feature-table filter-features \
--i-table atacama-table.qza \
--m-metadata-file uchime-dn-out/chimeras.qza \
--p-exclude-ids \
--o-filtered-table uchime-dn-out/table-nonchimeric-w-borderline.qza
qiime feature-table filter-seqs \
--i-data atacama-rep-seqs.qza \
--m-metadata-file uchime-dn-out/chimeras.qza \
--p-exclude-ids \
--o-filtered-data uchime-dn-out/rep-seqs-nonchimeric-w-borderline.qza
qiime feature-table summarize \
--i-table uchime-dn-out/table-nonchimeric-w-borderline.qza \
--o-visualization uchime-dn-out/table-nonchimeric-w-borderline.qzv
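The two filtering strategies above differ only in which ID list drives the filter. Keeping the IDs found in nonchimeras.qza drops everything else, including borderline sequences. Excluding the IDs found in chimeras.qza drops only the confirmed chimeras. The sketch below mimics this set arithmetic with plain text files and toy feature IDs (f1 to f5 are invented for illustration). It assumes, as the commands above imply, that borderline sequences appear in neither list.

```shell
# Toy feature IDs (hypothetical); real IDs come from the artifacts.
workdir=$(mktemp -d)
printf '%s\n' f1 f2 f3 f4 f5 > "$workdir/all.txt"          # all features in the input table
printf '%s\n' f1 f2 f3       > "$workdir/nonchimeras.txt"  # classified as non-chimeric
printf '%s\n' f5             > "$workdir/chimeras.txt"     # classified as chimeric
# f4 is in neither list: it stands in for a borderline chimera.

# Strategy 1: keep only IDs listed in nonchimeras (drops f4 and f5).
wo_borderline=$(grep -Fx -f "$workdir/nonchimeras.txt" "$workdir/all.txt" | paste -s -d ' ' -)

# Strategy 2: exclude IDs listed in chimeras (drops f5 only, keeps f4).
w_borderline=$(grep -Fvx -f "$workdir/chimeras.txt" "$workdir/all.txt" | paste -s -d ' ' -)

echo "without borderline: $wo_borderline"
echo "with borderline:    $w_borderline"
rm -r "$workdir"
```

With these toy lists, the first strategy keeps f1, f2, and f3, while the second also keeps f4. On real data, the feature counts reported by the two summarize visualizations differ in the same way.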