Leveraging supervised learning for functionally informed fine-mapping of cis-eQTLs identifies an additional 20,913 putative causal eQTLs
Published:2021-06-07
Issue:1
Volume:12
Page:
ISSN:2041-1723
Container-title:Nature Communications
language:en
Short-container-title:Nat Commun
Author:
Wang Qingbo S., Kelley David R., Ulirsch Jacob, Kanai Masahiro, Sadhuka Shuvom, Cui Ran, Albors Carlos, Cheng Nathan, Okada Yukinori, Matsuda Koichi, Yamanashi Yuji, Furukawa Yoichi, Morisaki Takayuki, Murakami Yoshinori, Kamatani Yoichiro, Muto Kaori, Nagai Akiko, Obara Wataru, Yamaji Ken, Takahashi Kazuhisa, Asai Satoshi, Takahashi Yasuo, Suzuki Takao, Sinozaki Nobuaki, Yamaguchi Hiroki, Minami Shiro, Murayama Shigeo, Yoshimori Kozo, Nagayama Satoshi, Obata Daisuke, Higashiyama Masahiko, Masumoto Akihide, Koretsune Yukihiro, Aguet Francois, Ardlie Kristin G., MacArthur Daniel G., Finucane Hilary K.
Abstract
The large majority of variants identified by GWAS are non-coding, motivating detailed characterization of the function of non-coding variants. Experimental methods that assess variants' effects on gene expression in native chromatin context via direct perturbation are low-throughput. Existing high-throughput computational predictors have thus lacked large gold-standard sets of regulatory variants for training and validation. Here, we leverage a set of 14,807 putative causal eQTLs in humans obtained through statistical fine-mapping, and we use 6121 features to directly train a predictor of whether a variant modifies nearby gene expression. We call the resulting prediction the expression modifier score (EMS). We validate EMS by comparing its ability to prioritize functional variants with that of other major scores. We then use EMS as a prior for statistical fine-mapping of eQTLs to identify an additional 20,913 putatively causal eQTLs, and we incorporate EMS into co-localization analysis to identify 310 additional candidate genes across UK Biobank phenotypes.
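The idea of using a functional score as a fine-mapping prior can be illustrated with a minimal sketch. It assumes a single causal variant per locus, so each variant's posterior inclusion probability (PIP) is proportional to its prior weight times its Bayes factor. This is an illustration of the general principle only, not the authors' pipeline: the abstract does not specify the fine-mapping method, and the function name, Bayes factors, and prior weights below are hypothetical.

```python
import math

def posterior_inclusion(log_bfs, priors):
    """PIPs under a single-causal-variant assumption: PIP_i ∝ prior_i * BF_i.

    log_bfs: natural-log Bayes factors, one per variant (hypothetical inputs).
    priors:  strictly positive prior weights, one per variant; with a
             functional prior these would come from a score such as EMS.
    """
    if len(log_bfs) != len(priors):
        raise ValueError("log_bfs and priors must have the same length")
    if any(p <= 0 for p in priors):
        raise ValueError("prior weights must be strictly positive")
    # Work in log space and subtract the maximum for numerical stability.
    log_w = [lb + math.log(p) for lb, p in zip(log_bfs, priors)]
    m = max(log_w)
    w = [math.exp(x - m) for x in log_w]
    total = sum(w)
    return [x / total for x in w]

# Made-up locus: two variants in tight LD carry identical evidence.
log_bfs = [2.0, 2.0, 0.5]
uniform_prior = [1 / 3, 1 / 3, 1 / 3]
functional_prior = [0.6, 0.2, 0.2]  # hypothetical functionally informed weights

pip_uniform = posterior_inclusion(log_bfs, uniform_prior)
pip_functional = posterior_inclusion(log_bfs, functional_prior)
```

With the uniform prior the first two variants are indistinguishable. With the functional prior, the variant given more prior weight receives the larger PIP, which is how a functionally informed prior can help resolve signals that association statistics alone cannot separate.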
Publisher
Springer Science and Business Media LLC
Subject
General Physics and Astronomy; General Biochemistry, Genetics and Molecular Biology; General Chemistry
References: 61 articles.
Cited by
56 articles.