Abstract
AbstractIn the past few years extensive studies have been put on the analysis of genome function, especially on expression quantitative trait loci (eQTL) which offered promise for characterization of the functional sequencing variation and for the understanding of the basic processes of gene regulation. However, most studies of eQTL mapping have not implemented models that allow for the non-equivalence of parental alleles as so-called parent-of-origin effects (POEs); thus, the number and effects of imprinted genes remain important open questions. Imprinting is a type of POE that the expression of certain genes depends on their allelic parent-of-origin which are important contributors to phenotypic variations, such as diabetes and many cancer types. Besides, multi-collinearity is an important issue arising from modeling multiple genetic effects. To address these challenges, we proposed a statistical framework to test the main allelic effects of the candidate eQTLs along with the POE with an orthogonal model for RNA sequencing (RNA-seq) data. Using simulations, we demonstrated the desirable power and Type I error of the orthogonal model which also achieved accurate estimation of the genetic effects and over-dispersion of the RNA-seq data. These methods were applied to an existing HapMap project trio dataset to validate the reported imprinted genes and to discovery novel imprinted genes. Using the orthogonal method, we validated existing imprinting genes and discovered two novel imprinting genes with significant dominance effect.Author SummaryIn the past decades, an unprecedented wealth of knowledge has been accumulated for understanding variations in human DNA level. However, this DNA-level knowledge has not been sufficiently translated to understanding the mechanisms of human diseases. Gene expression quantitative trait locus (eQTL) mapping is one of the most promising approaches to fill this gap, which aims to explore the genetic basis of gene expression. Genomic imprinting is an important epigenetic phenomenon which is an important contributor to phenotypic variation in human complex diseases and may explain some of the “hidden” heritable variability. Many imprinting genes are known to play important roles in human complex diseases such as diabetes, breast cancer and obesity. However, traditional eQTL mapping approaches does not allow for the detection of imprinting which is usually involved in gene expression imbalance. In this study, we have for the first time demonstrated the orthogonal statistical model can be applied to eQTL mapping for RNA sequencing (RNA-seq) data. We showed by simulated and real data that the orthogonal model outperformed the usual functional model for detecting main effects in most cases, which addressed the issue of confounding between the dominance and additive effects. Application of the statistical model to the HapMap data resulted in discovery of some potential eQTLs with imprinting effects and dominance effects on expression of RB1 and IGF1R genes.In summary, we developed a comprehensive framework for modeling imprinting effect for eQTL mapping, by decomposing the effects to multiple genetic components. This study is providing new insights into statistical modeling of eQTL mapping with RNA-seq data which allows for uncorrelated parameter estimation of genetic effects, covariates and over-dispersion parameter.
Publisher
Cold Spring Harbor Laboratory