eQTL analysis of RNA-Seq data



A genetic locus that affects gene expression is often referred to as an expression quantitative trait locus (eQTL). eQTL mapping studies assess the association of SNPs with genome-wide expression levels.
Paired-end reads are mapped to the hg38 reference genome with the STAR aligner. Expression is then quantified from the mapped reads without assembling transcripts: HTSeq counts the number of reads that map to exons, using RefSeq gene annotations. Then, to correct for systematic variability such as library fragment size, sequence composition bias, and read depth, the raw counts are normalized with the trimmed mean of M-values (TMM) method in edgeR.
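
To make the idea behind TMM concrete, here is a minimal sketch in Python with NumPy. It is not edgeR's implementation: edgeR's `calcNormFactors` also applies precision weights, picks a reference sample, and rescales the factors across samples, all of which are left out here. The function name `tmm_factor` and the toy counts are assumptions made for illustration; the trimming fractions (30% on M-values, 5% on A-values) follow edgeR's documented defaults.

```python
import numpy as np

def tmm_factor(sample, ref, m_trim=0.30, a_trim=0.05):
    """Simplified, unweighted TMM scaling factor of `sample` relative to `ref`.

    Illustrative only: edgeR additionally weights each gene by its
    estimated precision and rescales factors across all samples.
    """
    sample = np.asarray(sample, dtype=float)
    ref = np.asarray(ref, dtype=float)
    n_s, n_r = sample.sum(), ref.sum()

    # Only genes observed in both libraries have finite log ratios.
    keep = (sample > 0) & (ref > 0)
    p_s, p_r = sample[keep] / n_s, ref[keep] / n_r

    m = np.log2(p_s / p_r)          # M-values: log fold change per gene
    a = 0.5 * np.log2(p_s * p_r)    # A-values: mean log abundance per gene

    # Trim genes with extreme fold change or extreme abundance.
    m_lo, m_hi = np.quantile(m, [m_trim, 1 - m_trim])
    a_lo, a_hi = np.quantile(a, [a_trim, 1 - a_trim])
    inside = (m >= m_lo) & (m <= m_hi) & (a >= a_lo) & (a <= a_hi)

    # The factor is the trimmed mean of M-values, back on the linear scale.
    return 2 ** m[inside].mean()

# Toy example: one highly expressed gene dominates the second library.
ref = np.array([100.0] * 10)
sample = np.array([100.0] * 9 + [10000.0])
factor = tmm_factor(sample, ref)
effective_library_size = sample.sum() * factor
```

In the toy example, the single dominant gene inflates the raw library size of `sample`; multiplying the library size by the factor gives an effective library size under which the nine unchanged genes are comparable between the two libraries.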

Beyond quantifying gene expression, RNA-Seq data can be integrated with genotyping data, an approach known as expression quantitative trait loci (eQTL) analysis. Infinium CytoSNP-850K v1.2 arrays are used to identify genetic and structural variations. The array data are analyzed using GenomeStudio or BlueFuse Multi software against the human reference genome (hg38/GRCh38). After the raw array data are loaded, the SNP manifest file (.bpm) and the standard cluster file (.egt) are imported into GenomeStudio, and the SNP intensities are clustered. Genotype calls for a specific DNA sample are made by the calling algorithm (GenCall), which relies on information provided by the GenTrain clustering algorithm.

Matrix eQTL is used to test the associations efficiently, modeling the effect of genotype on expression as additive and linear.
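
As a rough illustration of what an additive linear model means for a single SNP-gene pair, the sketch below codes the genotype as the number of minor alleles (0, 1, or 2) and regresses expression on it. It does not use Matrix eQTL itself and ignores covariates; the function name `additive_eqtl` and the toy data are hypothetical. The test statistic is derived from the sample correlation, which is the standard identity for simple linear regression.

```python
import numpy as np

def additive_eqtl(genotype, expression):
    """Simple linear regression of expression on allele dosage (0/1/2).

    Returns the slope (effect per allele copy), the Pearson correlation,
    and the t-statistic with n - 2 degrees of freedom.
    Assumes the genotype varies and the fit is not perfect (|r| < 1).
    """
    g = np.asarray(genotype, dtype=float)
    y = np.asarray(expression, dtype=float)
    n = g.size

    # Center both variables so the intercept drops out.
    gc, yc = g - g.mean(), y - y.mean()

    beta = (gc @ yc) / (gc @ gc)                    # additive effect size
    r = (gc @ yc) / np.sqrt((gc @ gc) * (yc @ yc))  # Pearson correlation
    t = r * np.sqrt((n - 2) / (1 - r ** 2))         # t-statistic for beta
    return beta, r, t

# Toy data: six samples, expression rising with each copy of the allele.
genotype = [0, 0, 1, 1, 2, 2]
expression = [1.0, 1.2, 2.1, 1.9, 3.0, 3.2]
beta, r, t = additive_eqtl(genotype, expression)
```

A p-value would follow from comparing `t` against a t distribution with n - 2 degrees of freedom. In a genome-wide scan this test is repeated for every SNP-gene pair, so the efficiency of the implementation matters.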

Data analysis pipeline for RNA-Seq based eQTL mapping: