Taxonomic and diversity profiling of the microbiome - 16S rRNA gene amplicon sequence data


The 16S ribosomal RNA (rRNA) gene of bacteria encodes the RNA component of the 30S ribosomal subunit. Bacterial species carry one or more copies of the 16S rRNA gene, each containing nine hypervariable regions (V1-V9). High-throughput sequencing of amplicons of the 16S rRNA gene (a “marker gene”) has become a widely used method for studying bacterial phylogeny and classifying species.

Quantitative Insights Into Microbial Ecology 2 (QIIME 2, release 2018.6) [1] is a widely used package for identifying the abundance of microbes from 16S rRNA gene sequences. Briefly, a feature table containing the counts of each unique sequence in each sample is constructed with the qiime dada2 denoise-paired method. A feature is essentially any unit of observation, e.g., an OTU (Operational Taxonomic Unit), a sequence variant, a gene, or a metabolite. In QIIME 2, most features are currently OTUs or sequence variants (to work with OTUs instead, use the QIIME 2 plugin q2-vsearch).
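To make the idea of a feature table concrete, here is a minimal Python sketch that counts each unique sequence per sample. It is not DADA2: it skips error correction, read merging, and chimera removal, and only illustrates the final counting step. The function name, sample IDs, and toy sequences are all hypothetical.

```python
from collections import Counter


def build_feature_table(samples):
    """Count each unique sequence per sample.

    samples: dict mapping a sample ID to a list of sequences
             (assumed here to be already denoised).
    Returns (features, table), where features is a sorted list of the
    unique sequences and table[sample_id][feature] is a count.
    """
    per_sample = {sid: Counter(seqs) for sid, seqs in samples.items()}
    # The feature set is the union of sequences seen in any sample.
    features = sorted(set().union(*per_sample.values()))
    # Counter returns 0 for features absent from a sample.
    table = {
        sid: {feature: counts[feature] for feature in features}
        for sid, counts in per_sample.items()
    }
    return features, table


# Toy input (hypothetical sample IDs and very short made-up sequences).
toy_samples = {
    "gut_1": ["ACGT", "ACGT", "TTGA"],
    "gut_2": ["TTGA", "GGCC", "TTGA", "TTGA"],
}

features, table = build_feature_table(toy_samples)
for sample_id, row in table.items():
    print(sample_id, [row[f] for f in features])
```

Each row of the result is one sample and each column one feature, which is the same samples-by-features shape that the feature table artifact holds.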

Data produced by QIIME 2 exist as QIIME 2 artifacts. An artifact typically has the .qza file extension when it is stored in a file. Visualizations (.qzv file extension) are another type of data generated by QIIME 2; they can be viewed in a web browser such as Firefox without requiring a QIIME 2 installation. Because QIIME 2 works with artifacts rather than raw data files (e.g., FASTA files), we must first create a QIIME 2 artifact by importing our fastq.gz data files.
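An artifact differs from a plain data file in that a .qza or .qzv file is a zip archive bundling the data with metadata about its type and origin. The sketch below lists the contents of such an archive using only the Python standard library. So that it runs without QIIME 2, it inspects a small in-memory stand-in rather than a real artifact; the directory name `example-uuid` and the member paths are assumptions that only imitate the usual layout.

```python
import io
import zipfile


def summarize_archive(source):
    """Return (top-level names, all member paths) of a zip archive.

    source may be a file path (e.g. a .qza file) or a file-like object.
    """
    with zipfile.ZipFile(source) as archive:
        names = sorted(archive.namelist())
    roots = sorted({name.split("/", 1)[0] for name in names})
    return roots, names


def make_standin_archive():
    """Build an in-memory zip imitating an artifact layout.

    This is NOT a real QIIME 2 artifact: the paths and contents are
    placeholders chosen for illustration only.
    """
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr("example-uuid/VERSION", "placeholder\n")
        archive.writestr("example-uuid/metadata.yaml", "placeholder\n")
        archive.writestr("example-uuid/data/feature-table.biom", "placeholder\n")
    buffer.seek(0)
    return buffer


roots, names = summarize_archive(make_standin_archive())
print("top-level entries:", roots)
for name in names:
    print(name)
```

Passing the path of a real .qza file to `summarize_archive` should work the same way, since the function relies only on the file being a valid zip archive.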

The above analyses also produce a key summary table, the BIOM (Biological Observation Matrix) file, which contains feature (e.g., OTU) abundance information across samples, along with various annotations and sample metadata. For downstream analysis, the BIOM file can alternatively be uploaded to MicrobiomeAnalyst [2], a web-based tool for comprehensive statistical, visual, and meta-analysis of microbiome data. It supports taxonomic profiling, which characterizes community composition using methods developed in ecology such as alpha diversity (within-sample diversity) and beta diversity (between-sample diversity), and comparative analysis, which identifies features that differ significantly among the conditions under study.
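As a rough illustration of the two kinds of diversity, the sketch below computes one common alpha-diversity metric (the Shannon index) and one common beta-diversity metric (Bray-Curtis dissimilarity) from feature counts. These are example metrics picked for illustration, not necessarily the ones QIIME 2 or MicrobiomeAnalyst apply by default. The natural logarithm is an assumption (some tools use base 2), and the function names and toy counts are hypothetical.

```python
import math


def shannon_index(counts):
    """Alpha diversity: Shannon index H = -sum(p_i * ln p_i) for one sample."""
    total = sum(counts)
    if total == 0:
        return 0.0
    proportions = [c / total for c in counts if c > 0]
    return sum(-p * math.log(p) for p in proportions)


def bray_curtis(u, v):
    """Beta diversity: Bray-Curtis dissimilarity between two samples.

    u and v are count vectors over the same ordered list of features.
    0.0 means identical composition; 1.0 means no shared features.
    """
    if len(u) != len(v):
        raise ValueError("count vectors must have the same length")
    denominator = sum(u) + sum(v)
    if denominator == 0:
        return 0.0
    return sum(abs(a - b) for a, b in zip(u, v)) / denominator


# Toy counts for two samples over three features (made-up numbers).
sample_a = [2, 0, 1]
sample_b = [0, 1, 3]

print("Shannon, sample A:", shannon_index(sample_a))
print("Shannon, sample B:", shannon_index(sample_b))
print("Bray-Curtis, A vs B:", bray_curtis(sample_a, sample_b))
```

For the toy vectors, the Bray-Curtis numerator is |2-0| + |0-1| + |1-3| = 5 and the denominator is 3 + 4 = 7, so the dissimilarity is 5/7.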

Scripts are available at amplicon_metagenomics