2021

Annotation of genetic variants

4 minute read

Published:

Tools such as ANNOVAR, Variant Effect Predictor (VEP), and SnpEff annotate genetic variants (SNPs, indels, CNVs, etc.) present in a VCF file. These tools write their annotations into the INFO column of the original VCF file.
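To make the INFO column concrete, here is a minimal Python sketch that splits an INFO string into key/value pairs. The VCF record below is made up for illustration, and the `ANN` value is a shortened stand-in for the pipe-delimited annotation string SnpEff writes; `parse_info` is a hypothetical helper, not part of any of the tools named above.

```python
def parse_info(info):
    """Split a VCF INFO string into a dict; flag-style keys (no '=') map to True."""
    fields = {}
    if info in ("", "."):
        return fields
    for entry in info.split(";"):
        if not entry:
            continue
        key, sep, value = entry.partition("=")
        fields[key] = value if sep else True
    return fields


# Made-up, tab-separated VCF data line (assumption: eight fixed columns, INFO is the 8th).
line = "chr1\t12345\t.\tA\tG\t50\tPASS\tDP=20;ANN=G|missense_variant|MODERATE|GENE1;DB"
columns = line.split("\t")
info = parse_info(columns[7])

# Annotation sub-fields are pipe-separated inside the ANN value.
ann_fields = info["ANN"].split("|")
print(info["DP"], ann_fields[1], info["DB"])
```

For real files a dedicated VCF library is a safer choice, since INFO values can carry multiple comma-separated entries and escaped characters that this sketch does not handle.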

2020

2019

ATAC-seq peak calling with MACS2

2 minute read

Published:

ATAC-seq (Assay for Transposase-Accessible Chromatin with high-throughput sequencing) is a next-generation sequencing approach for profiling open chromatin regions and assessing genome-wide chromatin accessibility.

Taxonomic and diversity profiling of the microbiome - 16S rRNA gene amplicon sequence data

1 minute read

Published:

The 16S ribosomal RNA (rRNA) gene of bacteria codes for the RNA component of the 30S ribosomal subunit. Bacterial species carry from one to multiple copies of the 16S rRNA gene, each with nine hypervariable regions, V1-V9. High-throughput sequencing of 16S rRNA gene (a “marker gene”) amplicons has become a widely used method to study bacterial phylogeny and species classification.

2018

Taxonomic and functional profiling of the microbiome - whole genome shotgun metagenomics

1 minute read

Published:

This workflow consists of taxonomic and functional profiling of shotgun metagenomics sequencing (MGS) reads using MetaPhlAn2 and HUMAnN2, respectively. To perform taxonomic (phylum, genus, or species level) profiling of the MGS data, the MetaPhlAn2 pipeline was run on a high-performance multicore computing cluster.

Genomic variants from RNA-Seq data

1 minute read

Published:

RNA-Seq allows the detection and quantification of known and rare RNA transcripts within a sample. In addition to differential expression and detection of novel transcripts, RNA-Seq also supports the detection of genomic variation in expressed regions.

2017

eQTL analysis of RNA-Seq data

1 minute read

Published:

A genetic locus that affects gene expression is often referred to as an expression quantitative trait locus (eQTL). eQTL mapping studies assess the association of SNPs with genome-wide expression levels.
Paired-end reads are mapped to the hg38 reference genome with the STAR aligner. Expression is then quantified from the mapped reads without assembling transcripts: HTSeq counts the reads that map to exons, using RefSeq gene annotations. Finally, to correct for systematic variability such as library fragment size, sequence composition bias, and read depth, the raw counts are normalized by the trimmed mean of M-values (TMM) method in edgeR.
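The idea behind TMM can be illustrated with a deliberately simplified Python sketch. It is not edgeR's implementation: edgeR also trims on average expression (A-values) and uses precision weights, both omitted here. The function name `tmm_factor`, the 30% trim, and the toy counts are assumptions for illustration only.

```python
import math


def tmm_factor(sample, reference, trim_m=0.3):
    """Simplified, unweighted trimmed mean of M-values for one sample vs. a reference.

    sample, reference: per-gene raw read counts in the same gene order.
    Returns a scaling factor; 1.0 means no composition difference was detected.
    """
    n_s, n_r = sum(sample), sum(reference)
    m_values = []
    for s, r in zip(sample, reference):
        if s > 0 and r > 0:  # log ratios are undefined for zero counts
            m_values.append(math.log2((s / n_s) / (r / n_r)))
    m_values.sort()
    k = int(len(m_values) * trim_m)  # number of values trimmed from each tail
    kept = m_values[k:len(m_values) - k] if len(m_values) > 2 * k else m_values
    if not kept:
        return 1.0
    return 2 ** (sum(kept) / len(kept))


# Toy counts: the second library is simply sequenced twice as deeply,
# so the relative composition is unchanged and the factor stays at 1.
reference = [10, 20, 30, 40]
deeper = [20, 40, 60, 80]
factor = tmm_factor(deeper, reference)
print(factor)
```

Because the M-values compare library-size-scaled proportions, a pure change in sequencing depth leaves the factor unchanged; only shifts in composition move it away from 1.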

Quality control for GWAS studies

1 minute read

Published:

An important step in the analysis of genome-wide association studies (GWAS) is to identify problematic subjects and markers. Quality control (QC) in GWAS removes low-quality markers and individuals, which greatly improves the accuracy of findings.
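As a rough sketch of what marker-level QC can look like, the Python below filters markers by call rate and minor allele frequency (MAF). The thresholds (0.95 and 0.01), the function names, and the toy genotypes are assumptions chosen for illustration, not values taken from the post; genotypes are coded as counts of one allele (0, 1, 2), with `None` for a missing call.

```python
def marker_stats(genotypes):
    """Return (call_rate, maf) for one marker across all individuals."""
    called = [g for g in genotypes if g is not None]
    call_rate = len(called) / len(genotypes) if genotypes else 0.0
    if not called:
        return call_rate, 0.0
    freq = sum(called) / (2 * len(called))  # frequency of the counted allele
    return call_rate, min(freq, 1 - freq)


def passing_markers(markers, min_call_rate=0.95, min_maf=0.01):
    """Keep marker names that meet both the call-rate and MAF thresholds."""
    keep = []
    for name, genotypes in markers.items():
        call_rate, maf = marker_stats(genotypes)
        if call_rate >= min_call_rate and maf >= min_maf:
            keep.append(name)
    return keep


# Toy data: 20 individuals per marker (hypothetical marker names).
markers = {
    "rs_a": [0, 1, 2, 1, 0, 1, 0, 2, 1, 0] * 2,  # well-called, common variant
    "rs_b": [0] * 20,                            # monomorphic, MAF = 0
    "rs_c": [0, 1, None, None] * 5,              # half the calls missing
}
kept = passing_markers(markers)
print(kept)
```

Sample-level QC follows the same pattern with the matrix transposed: compute each individual's call rate across markers and drop those below a chosen threshold.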

Copy number variation discovery workflows using NGS data

less than 1 minute read

Published:

Copy number variations (CNVs) represent gains or losses of genomic regions. CNVs are transmitted from parents to offspring or arise de novo, and they play an important role in neuropsychiatric disorders and cancers.