Comparative analysis of transcription start site using mutual information

Published in Genomics Proteomics Bioinformatics, 2006

Recommended citation: Ashok Reddy D and Mitra C K. (2006). "Comparative analysis of transcription start site using mutual information." Geno. Prot. Bioinf. 4(3), 183-195. http://adinasarapu.github.io/files/geno2006.pdf

The transcription start site (TSS) region shows greater variability than other promoter elements. We examine this variability using information content as a measure. In this study, the variability is significant in a block of 5 nucleotides (nt) surrounding the TSS when compared with a block of 15 nt, which suggests that the region actually involved is 5–10 nt in size. For Escherichia coli, the information content derived from dinucleotide substitution matrices discriminates clearly better, suggesting the presence of some correlations. For human, however, this effect is much weaker, and for mouse it is practically absent. We conclude that short-range correlations within the TSS region are species-dependent rather than universal. We further observe that the mitochondrial control element contains variable regions other than the TSS. Finally, effective comparisons can only be made on blocks; single-nucleotide comparisons give no detectable signal.
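The two quantities the abstract relies on, information content and mutual information, can be illustrated with a minimal sketch. This is not the procedure used in the paper, which works with dinucleotide substitution matrices over blocks. It only shows the standard Shannon definitions applied to columns of aligned sequences. The function names and the toy sequences are hypothetical and are not taken from the paper's data.

```python
import math
from collections import Counter


def information_content(column, alphabet_size=4):
    """Information content in bits of one alignment column.

    Computed as log2(alphabet_size) minus the Shannon entropy of the
    observed nucleotide frequencies (no small-sample correction).
    """
    n = len(column)
    counts = Counter(column)
    entropy = -sum((c / n) * math.log2(c / n) for c in counts.values())
    return math.log2(alphabet_size) - entropy


def mutual_information(col_a, col_b):
    """Mutual information in bits between two alignment columns."""
    n = len(col_a)
    freq_a = Counter(col_a)
    freq_b = Counter(col_b)
    freq_ab = Counter(zip(col_a, col_b))
    mi = 0.0
    for (a, b), c in freq_ab.items():
        p_ab = c / n
        mi += p_ab * math.log2(p_ab / ((freq_a[a] / n) * (freq_b[b] / n)))
    return mi


def columns(sequences):
    """Transpose equal-length aligned sequences into per-position columns."""
    return ["".join(col) for col in zip(*sequences)]


# Hypothetical aligned sequences, for illustration only.
toy_sequences = ["ACGTA", "ACGTC", "ACTTG", "ACGAT"]
cols = columns(toy_sequences)

per_position_ic = [information_content(c) for c in cols]
adjacent_mi = [mutual_information(cols[i], cols[i + 1])
               for i in range(len(cols) - 1)]
```

A fully conserved column has the maximum information content of 2 bits, and a column with all four nucleotides equally frequent has 0 bits. Mutual information between adjacent columns is one simple way to probe the short-range correlations that the abstract describes as species-dependent.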
