Abstract
As their statistical power grows, genome-wide association studies (GWAS) have identified an increasing number of loci underlying quantitative traits of interest. These loci are scattered throughout the genome and are individually responsible only for small fractions of the total heritable trait variance. The recently proposed omnigenic model provides a conceptual framework to explain these observations by postulating that numerous distant loci contribute to each complex trait via effect propagation through intracellular regulatory networks. We formalize this conceptual framework by proposing the “quantitative omnigenic model” (QOM), a statistical model that combines prior knowledge of the regulatory network topology with genomic data. By applying our model to gene expression traits in yeast, we demonstrate that QOM achieves gene expression prediction performance similar to that of traditional GWAS with hundreds of times fewer parameters, while simultaneously extracting candidate causal and quantitative chains of effect propagation through the regulatory network for every individual gene. We estimate the fraction of heritable trait variance in cis and in trans, break the latter down by effect propagation order, assess the trans-variance not attributable to transcriptional regulation, and show that QOM correctly accounts for the low-dimensional structure of gene expression covariance. We furthermore demonstrate the relevance of QOM for systems biology by employing it as a statistical test for the quality of regulatory network reconstructions and by linking it to the propagation of non-genetic, environmental effects.
Significance statement
Genetic variation leads to differences in traits implicated in health and disease. The identification of genetic variants associated with these traits is spearheaded by “genome-wide association studies” (GWAS), statistically rigorous procedures whose power has grown with the number of genotyped samples. Nevertheless, GWAS have a substantial shortcoming: they are ill-equipped to detect the causal basis and reveal the complex systemic mechanisms of polygenic traits. Even a single genetic change can propagate throughout the entire genetic regulatory network, causing a myriad of spurious detections and thereby significantly limiting the usefulness of GWAS. We tackle this challenge with a novel statistical approach that incorporates known regulatory network information to substantially boost the interpretability of state-of-the-art genomic analyses while simultaneously extracting novel systems biology insights.
Publisher
Cold Spring Harbor Laboratory