2023

Quantitative proteomics: aptamer-based quantitation of proteins

5 minute read

Published:

The aptamer-based SomaScan® assay is one of the popular methods for measuring the abundance of protein targets. There is very little information on the correlation between mass spectrometry (MS)-based proteomics, SomaScan and Olink assays; Olink is another popular high-throughput, antibody-based platform. Some studies have also reported measurement variation between these platforms. In general, aptamers (SOMAmers) are selected against target proteins in their native conformation and, in some cases, against a functional protein with “known” post-translational modifications (PTMs). It’s well known that novel PTMs (pathogen- or disease-induced) can alter a protein’s structure, electrophilicity and interactions with other proteins, which can in turn affect aptamer binding. The other main disadvantage is that quantification is based on DNA microarray chips, which are subject to background noise. The main advantages are lower cost and ease of data analysis.

2022

Kaplan-Meier Curve using R

2 minute read

Published:

A Kaplan-Meier curve shows the probability of an event (for example, survival) at a given time interval. The log-rank test compares the survival curves of two or more groups. With a small subset of patients, the Kaplan-Meier estimates can be misleading and should be interpreted with caution.
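To make the estimate concrete, here is a minimal sketch of the Kaplan-Meier product-limit calculation in plain Python. It is not the R workflow from the post; the function name `kaplan_meier` and the follow-up data are hypothetical and chosen only for illustration.

```python
from collections import Counter


def kaplan_meier(times, events):
    """Return [(time, survival_probability)] at each distinct event time.

    times  -- follow-up time per subject
    events -- 1 if the event was observed, 0 if the subject was censored
    """
    deaths = Counter(t for t, e in zip(times, events) if e)
    leaving = Counter(times)  # subjects leaving the risk set at t (event or censored)
    at_risk = len(times)
    survival = 1.0
    curve = []
    for t in sorted(leaving):
        d = deaths.get(t, 0)
        if d:
            # Product-limit step: multiply by the fraction surviving time t.
            survival *= 1.0 - d / at_risk
            curve.append((t, survival))
        at_risk -= leaving[t]
    return curve


# Hypothetical follow-up times in months; 0 marks a censored subject.
times = [2, 3, 3, 5, 8, 8, 9]
events = [1, 1, 0, 1, 0, 1, 1]
for t, s in kaplan_meier(times, events):
    print(f"t={t}: S(t)={s:.3f}")
```

With only seven subjects, every event moves the curve by a large step, which illustrates why estimates from small cohorts should be read with caution.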

2021

Annotation of genetic variants

4 minute read

Published:

Tools such as ANNOVAR, Variant Effect Predictor (VEP) or SnpEff annotate genetic variants (SNPs, indels, CNVs, etc.) present in a VCF file. These tools integrate the annotations within the INFO column of the original VCF file.
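As a rough illustration of where those annotations end up, the sketch below splits the INFO column of a single VCF data line into key/value pairs. The helper `parse_info` and the example line are hypothetical; the `ANN` value is assumed to follow the pipe-delimited layout that SnpEff writes, shortened here to four fields.

```python
def parse_info(info):
    """Split a VCF INFO string into a dict; flag-type keys map to True."""
    fields = {}
    if info == ".":  # "." means the INFO column is empty
        return fields
    for entry in info.split(";"):
        key, sep, value = entry.partition("=")
        fields[key] = value if sep else True
    return fields


# Hypothetical VCF data line (tab-separated); INFO is the eighth column.
line = "chr1\t12345\trs0000\tA\tG\t50\tPASS\tDP=30;DB;ANN=G|missense_variant|MODERATE|GENE1"
columns = line.split("\t")
info = parse_info(columns[7])

print(info["DP"])                  # read depth, kept as a string
print(info["ANN"].split("|")[1])   # second ANN sub-field: the predicted effect
```

For real files, an established VCF parser is a safer choice than manual string splitting, since INFO values may be comma-separated lists or contain escaped characters.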

2020

2019

ATAC-seq peak calling with MACS2

2 minute read

Published:

ATAC-seq (Assay for Transposase-Accessible Chromatin with high-throughput sequencing) is a next-generation sequencing approach for analyzing open chromatin regions to assess genome-wide chromatin accessibility.

Taxonomic and diversity profiling of the microbiome - 16S rRNA gene amplicon sequence data

1 minute read

Published:

The 16S ribosomal RNA (rRNA) gene of bacteria codes for the RNA component of the 30S ribosomal subunit. Different bacterial species have one or more copies of the 16S rRNA gene, each with nine hypervariable regions (V1-V9). High-throughput sequencing of 16S rRNA gene (a “marker gene”) amplicons has become a widely used method to study bacterial phylogeny and species classification.

2018

Taxonomic and functional profiling of the microbiome - whole genome shotgun metagenomics

1 minute read

Published:

This workflow consists of taxonomic and functional profiling of shotgun metagenomics sequencing (MGS) reads using MetaPhlAn2 and HUMAnN2, respectively. To perform taxonomic (phylum, genus or species level) profiling of the MGS data, the MetaPhlAn2 pipeline was run on a high-performance multicore cluster computing environment.

Genomic variants from RNA-Seq data

1 minute read

Published:

RNA-Seq allows the detection and quantification of known and rare RNA transcripts within a sample. In addition to differential expression and detection of novel transcripts, RNA-Seq also supports the detection of genomic variation in expressed regions.

2017

eQTL analysis of RNA-Seq data

1 minute read

Published:

A genetic locus that affects gene expression is often referred to as an expression quantitative trait locus (eQTL). eQTL mapping studies assess the association of SNPs with genome-wide expression levels.
Paired-end reads are mapped to the hg38 reference genome with the STAR aligner. The mapped reads are then used for expression quantification without assembling transcripts: HTSeq counts the number of reads that map to each exon, using RefSeq gene annotations. Finally, to correct for systematic variability such as library fragment size, sequence composition bias and read depth, the raw counts are normalized with the trimmed mean of M-values (TMM) method in edgeR.
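The association step itself can be sketched as a simple linear regression of normalized expression on genotype dosage. This is only an illustration of the idea, not the pipeline used in the post; the function `eqtl_slope` and the toy values are hypothetical, and a real analysis would also include covariates and multiple-testing correction.

```python
def eqtl_slope(dosages, expression):
    """Least-squares slope of expression on genotype dosage (0, 1 or 2)."""
    n = len(dosages)
    mean_x = sum(dosages) / n
    mean_y = sum(expression) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(dosages, expression))
    sxx = sum((x - mean_x) ** 2 for x in dosages)
    if sxx == 0:
        # All subjects share one genotype, so no slope can be estimated.
        raise ValueError("SNP is monomorphic in this sample")
    return sxy / sxx


# Hypothetical data: alternate-allele counts and normalized expression
# of one gene for six subjects.
dosages = [0, 0, 1, 1, 2, 2]
expression = [1.0, 1.2, 2.0, 2.2, 3.0, 3.2]
slope = eqtl_slope(dosages, expression)
print(f"expression change per alternate allele: {slope:.2f}")
```

A slope far from zero suggests that each additional copy of the alternate allele shifts expression of the gene, which is the signal an eQTL scan looks for across SNP-gene pairs.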

Quality control for GWAS studies

1 minute read

Published:

An important step in the analysis of genome-wide association studies (GWAS) is to identify problematic subjects and markers. Quality control (QC) in GWAS removes low-quality markers and individuals, which greatly increases the accuracy of the findings.
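As a small illustration of marker-level QC, the sketch below drops markers with a low call rate or a low minor allele frequency (MAF). The function `marker_qc`, the genotype coding and the thresholds are assumptions made for this example, not values taken from the post.

```python
def marker_qc(genotypes, min_call_rate=0.95, min_maf=0.01):
    """Return the names of markers that pass call-rate and MAF filters.

    genotypes -- dict: marker name -> list of per-subject genotypes coded as
                 0, 1 or 2 copies of the alternate allele, or None if missing
    """
    kept = []
    for marker, calls in genotypes.items():
        observed = [g for g in calls if g is not None]
        if not observed:
            continue  # no usable genotypes at all
        call_rate = len(observed) / len(calls)
        if call_rate < min_call_rate:
            continue
        alt_freq = sum(observed) / (2 * len(observed))
        maf = min(alt_freq, 1 - alt_freq)
        if maf >= min_maf:
            kept.append(marker)
    return kept


# Hypothetical genotypes for ten subjects at three markers.
genotypes = {
    "snp_a": [0, 1, 2, 1, 0, 1, 0, 2, 1, 0],               # complete and common
    "snp_b": [0, None, None, 1, 0, 0, 1, None, 0, 0],      # 30% missing calls
    "snp_c": [0] * 10,                                     # monomorphic
}
passing = marker_qc(genotypes, min_call_rate=0.9, min_maf=0.05)
print(passing)
```

The same pattern applies to subject-level QC by computing the call rate across markers for each individual instead of across individuals for each marker.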

Copy number variation discovery workflows using NGS data

less than 1 minute read

Published:

Copy number variations (CNVs) represent the gain or loss of genomic regions. CNVs are transmitted from parents to offspring or arise de novo, and play an important role in neuropsychiatric disorders and cancers.