Comprehensive functional genomic resource and integrative model for the human brain
Author:
Wang Daifeng (1,2,3), Liu Shuang (1,2), Warrell Jonathan (1,2), Won Hyejung (4,5), Shi Xu (1,2), Navarro Fabio C. P. (1,2), Clarke Declan (1,2), Gu Mengting (1), Emani Prashant (1,2), Yang Yucheng T. (1,2), Xu Min (1,2), Gandal Michael J. (6), Lou Shaoke (1,2), Zhang Jing (1,2), Park Jonathan J. (1,2), Yan Chengfei (1,2), Rhie Suhn Kyong (7), Manakongtreecheep Kasidet (1,2), Zhou Holly (1,2), Nathan Aparna (1,2), Peters Mette (8), Mattei Eugenio (9), Fitzgerald Dominic (10), Brunetti Tonya (10), Moore Jill (9), Jiang Yan (11), Girdhar Kiran (12), Hoffman Gabriel E. (12), Kalayci Selim (12), Gümüş Zeynep H. (12), Crawford Gregory E. (13), Roussos Panos (11,12), Akbarian Schahram (11,14), Jaffe Andrew E. (15), White Kevin P. (10,16), Weng Zhiping (9), Sestan Nenad (17), Geschwind Daniel H. (18,19,20), Knowles James A. (21), Gerstein Mark B. (1,2,22,23),
Ashley-Koch Allison E., Crawford Gregory E., Garrett Melanie E., Song Lingyun, Safi Alexias, Johnson Graham D., Wray Gregory A., Reddy Timothy E., Goes Fernando S., Zandi Peter, Bryois Julien, Jaffe Andrew E., Price Amanda J., Ivanov Nikolay A., Collado-Torres Leonardo, Hyde Thomas M., Burke Emily E., Kleiman Joel E., Tao Ran, Shin Joo Heon, Akbarian Schahram, Girdhar Kiran, Jiang Yan, Kundakovic Marija, Brown Leanne, Kassim Bibi S., Park Royce B., Wiseman Jennifer R., Zharovsky Elizabeth, Jacobov Rivka, Devillers Olivia, Flatow Elie, Hoffman Gabriel E., Lipska Barbara K., Lewis David A., Haroutunian Vahram, Hahn Chang-Gyu, Charney Alexander W., Dracheva Stella, Kozlenkov Alexey, Belmont Judson, DelValle Diane, Francoeur Nancy, Hadjimichael Evi, Pinto Dalila, van Bakel Harm, Roussos Panos, Fullard John F., Bendl Jaroslav, Hauberg Mads E., Mangravite Lara M., Peters Mette A., Chae Yooree, Peng Junmin, Niu Mingming, Wang Xusheng, Webster Maree J., Beach Thomas G., Chen Chao, Jiang Yi, Dai Rujia, Shieh Annie W., Liu Chunyu, Grennan Kay S., Xia Yan, Vadukapuram Ramu, Wang Yongjun, Fitzgerald Dominic, Cheng Lijun, Brown Miguel, Brown Mimi, Brunetti Tonya, Goodman Thomas, Alsayed Majd, Gandal Michael J., Geschwind Daniel H., Won Hyejung, Polioudakis Damon, Wamsley Brie, Yin Jiani, Hadzic Tarik, De La Torre Ubieta Luis, Swarup Vivek, Sanders Stephan J., State Matthew W., Werling Donna M., An Joon-Yong, Sheppard Brooke, Willsey A. Jeremy, White Kevin P., Ray Mohana, Giase Gina, Kefi Amira, Mattei Eugenio, Purcaro Michael, Weng Zhiping, Moore Jill, Pratt Henry, Huey Jack, Borrman Tyler, Sullivan Patrick F., Giusti-Rodriguez Paola, Kim Yunjung, Sullivan Patrick, Szatkiewicz Jin, Rhie Suhn Kyong, Armoskus Christoper, Camarena Adrian, Farnham Peggy J., Spitsyna Valeria N., Witt Heather, Schreiner Shannon, Evgrafov Oleg V., Knowles James A., Gerstein Mark, Liu Shuang, Wang Daifeng, Navarro Fabio C. P., Warrell Jonathan, Clarke Declan, Emani Prashant S., Gu Mengting, Shi Xu, Xu Min, Yang Yucheng T., Kitchen Robert R., Gürsoy Gamze, Zhang Jing, Carlyle Becky C., Nairn Angus C., Li Mingfeng, Pochareddy Sirisha, Sestan Nenad, Skarica Mario, Li Zhen, Sousa Andre M. M., Santpere Gabriel, Choi Jinmyung, Zhu Ying, Gao Tianliuyun, Miller Daniel J., Cherskov Adriana, Yang Mo, Amiri Anahita, Coppola Gianfilippo, Mariani Jessica, Scuderi Soraya, Szekely Anna, Vaccarino Flora M., Wu Feinan, Weissman Sherman, Roychowdhury Tanmoy, Abyzov Alexej,
Affiliation:
1. Program in Computational Biology and Bioinformatics, Yale University, New Haven, CT 06520, USA.
2. Department of Molecular Biophysics and Biochemistry, Yale University, New Haven, CT 06520, USA.
3. Department of Biomedical Informatics, Stony Brook University, Stony Brook, NY 11794, USA.
4. Department of Genetics, University of North Carolina, Chapel Hill, NC 27599, USA.
5. UNC Neuroscience Center, University of North Carolina, Chapel Hill, NC 27599, USA.
6. Department of Psychiatry, Semel Institute, David Geffen School of Medicine, University of California–Los Angeles, 695 Charles E. Young Drive South, Los Angeles, CA 90095, USA.
7. Keck School of Medicine and Norris Comprehensive Cancer Center, University of Southern California, Los Angeles, CA 90007, USA.
8. Sage Bionetworks, Seattle, WA 98109, USA.
9. Program in Bioinformatics and Integrative Biology, University of Massachusetts Medical School, Worcester, MA 01605, USA.
10. Institute for Genomics and Systems Biology, Department of Human Genetics, University of Chicago, Chicago, IL 60637, USA.
11. Department of Psychiatry, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA.
12. Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA.
13. Center for Genomic and Computational Biology, Department of Pediatrics, Duke University, Durham, NC 27708, USA.
14. Department of Neuroscience, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA.
15. Lieber Institute for Brain Development, Johns Hopkins Medical Campus, and Departments of Mental Health and Biostatistics, Johns Hopkins Bloomberg School of Public Health, Baltimore, MD 21205, USA.
16. Tempus Labs, Chicago, IL 60654, USA.
17. Department of Neuroscience and Kavli Institute for Neuroscience, Yale School of Medicine, New Haven, CT 06520, USA.
18. Department of Human Genetics, David Geffen School of Medicine, University of California–Los Angeles, Los Angeles, CA 90095, USA.
19. Program in Neurobehavioral Genetics, Semel Institute, David Geffen School of Medicine, University of California–Los Angeles, Los Angeles, CA 90095, USA.
20. Department of Neurology, Center for Autism Research and Treatment, Semel Institute, David Geffen School of Medicine, University of California–Los Angeles, Los Angeles, CA 90095, USA.
21. SUNY Downstate Medical Center College of Medicine, Brooklyn, NY 11203, USA.
22. Department of Computer Science, Yale University, New Haven, CT 06520, USA.
23. Department of Statistics and Data Science, Yale University, New Haven, CT 06520, USA.
Abstract
INTRODUCTION
Strong genetic associations have been found for a number of psychiatric disorders. However, understanding the underlying molecular mechanisms remains challenging.
RATIONALE
To address this challenge, the PsychENCODE Consortium has developed a comprehensive online resource and integrative models for the functional genomics of the human brain.
RESULTS
The base of the pyramidal resource is the datasets generated by PsychENCODE, including bulk transcriptome, chromatin, genotype, and Hi-C datasets and single-cell transcriptomic data from ~32,000 cells for major brain regions. We have merged these with data from Genotype-Tissue Expression (GTEx), ENCODE, Roadmap Epigenomics, and single-cell analyses. Via uniform processing, we created a harmonized resource, allowing us to survey functional genomics data on the brain over a sample size of 1866 individuals.
From this uniformly processed dataset, we created derived data products. These include lists of brain-expressed genes, coexpression modules, and single-cell expression profiles for many brain cell types; ~79,000 brain-active enhancers with associated Hi-C loops and topologically associating domains; and ~2.5 million expression quantitative-trait loci (QTLs) comprising ~238,000 linkage-disequilibrium–independent single-nucleotide polymorphisms, as well as other types of QTLs associated with splice isoforms, cell fractions, and chromatin activity. By using these, we found that >88% of the cross-population variation in brain gene expression can be accounted for by changes in cell fractions. Furthermore, a number of disorders and aging are associated with changes in cell-type proportions. The derived data also enable comparison between the brain and other tissues. In particular, by using spectral analyses, we found that the brain has distinct expression and epigenetic patterns, including a greater extent of noncoding transcription than other tissues.
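To make the cell-fraction statement concrete, the following is a minimal sketch, not the consortium's pipeline. It assumes a simple linear mixing model in which each individual's bulk expression is a weighted sum of per-cell-type expression signatures, with the weights being that individual's cell fractions, and it reports the pooled fraction of across-individual variance that such a reconstruction reproduces. The function names, the two cell types, and all numbers are hypothetical toy values.

```python
def reconstruct(signatures, fractions):
    """Predict bulk expression (genes x individuals) under an assumed linear mix.

    signatures: genes x cell types (expression of each gene in each cell type)
    fractions:  individuals x cell types (cell-type proportions per individual)
    """
    return [
        [sum(s * w for s, w in zip(gene_sig, indiv_frac)) for indiv_frac in fractions]
        for gene_sig in signatures
    ]


def variance_explained(bulk, predicted):
    """Return 1 - SS_res / SS_tot, pooled over genes.

    SS_tot is measured around each gene's mean across individuals, so the
    result is the share of across-individual variation that is reproduced.
    """
    ss_res = 0.0
    ss_tot = 0.0
    for obs_row, pred_row in zip(bulk, predicted):
        mean_obs = sum(obs_row) / len(obs_row)
        ss_res += sum((o - p) ** 2 for o, p in zip(obs_row, pred_row))
        ss_tot += sum((o - mean_obs) ** 2 for o in obs_row)
    return 1.0 - ss_res / ss_tot


# Hypothetical toy data: 3 genes, 2 cell types, 3 individuals.
signatures = [[10.0, 1.0], [2.0, 8.0], [5.0, 5.0]]
fractions = [[0.7, 0.3], [0.5, 0.5], [0.2, 0.8]]

predicted = reconstruct(signatures, fractions)

# "Observed" bulk data = prediction plus small made-up residuals.
residuals = [[0.2, -0.1, 0.0], [-0.3, 0.1, 0.2], [0.1, 0.0, -0.1]]
bulk = [
    [p + r for p, r in zip(pred_row, res_row)]
    for pred_row, res_row in zip(predicted, residuals)
]

r2_noisy = variance_explained(bulk, predicted)
r2_exact = variance_explained(predicted, predicted)
print(f"variance explained with residuals: {r2_noisy:.3f}")
```

In this toy setup the residuals are small by construction, so the explained share is close to 1; with real data the fractions themselves would first have to be estimated, which this sketch does not attempt.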
The top level of the resource consists of integrative networks for regulation and machine-learning models for disease prediction. The networks include a full gene regulatory network (GRN) for the brain, linking transcription factors, enhancers, and target genes from merging of the QTLs, generalized element-activity correlations, and Hi-C data. By using this network, we link disease genes to genome-wide association study (GWAS) variants for psychiatric disorders. For schizophrenia, we linked 321 genes to the 142 reported GWAS loci. We then embedded the regulatory network into a deep-learning model to predict psychiatric phenotypes from genotype and expression. Our model gives a ~6-fold improvement in prediction over additive polygenic risk scores. Moreover, it achieves a ~3-fold improvement over additive models, even when the gene expression data are imputed, highlighting the value of having just a small amount of transcriptome data for disease prediction. Lastly, it highlights key genes and pathways associated with disorder prediction, including immunological, synaptic, and metabolic pathways, recapitulating de novo results from more targeted analyses.
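For readers unfamiliar with the baseline named in this comparison, an additive polygenic risk score is conventionally a weighted sum of risk-allele dosages, PRS_i = Σ_j β_j · g_ij, where g_ij is the dosage (0, 1, or 2) of variant j in individual i and β_j is its estimated effect size. The sketch below illustrates only that baseline, not the deep-learning model; the function name, SNP identifiers, and effect sizes are invented for illustration.

```python
def polygenic_risk_score(dosages, effect_sizes):
    """Additive score: sum of effect size times allele dosage.

    dosages:      dict mapping SNP id -> allele dosage (0, 1, or 2)
    effect_sizes: dict mapping SNP id -> per-allele effect size (beta)
    SNPs without a known effect size are skipped.
    """
    return sum(
        effect_sizes[snp] * dosage
        for snp, dosage in dosages.items()
        if snp in effect_sizes
    )


# Hypothetical effect sizes and one hypothetical individual.
effect_sizes = {"snp_a": 0.10, "snp_b": -0.05, "snp_c": 0.20}
individual = {"snp_a": 2, "snp_b": 1, "snp_c": 0, "snp_d": 1}  # snp_d has no beta

score = polygenic_risk_score(individual, effect_sizes)
print(f"PRS = {score:.2f}")  # 0.10*2 + (-0.05)*1 + 0.20*0 = 0.15
```

Because every variant contributes independently, a score of this form cannot represent interactions among variants or intermediate molecular layers such as gene expression, which is the gap the network-based model described above is meant to address.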
CONCLUSION
Our resource and integrative analyses have uncovered genomic elements and networks in the brain, which in turn have provided insight into the molecular mechanisms underlying psychiatric disorders. Our deep-learning model improves disease risk prediction over traditional approaches and can be extended with additional data types (e.g., microRNA and neuroimaging).
A comprehensive functional genomic resource for the adult human brain.
The resource forms a three-layer pyramid. The bottom layer includes sequencing datasets for traits, such as schizophrenia. The middle layer represents derived datasets, including functional genomic elements and QTLs. The top layer contains integrated models, which link genotypes to phenotypes. DSPN, Deep Structured Phenotype Network; PC1 and PC2, principal components 1 and 2; ref, reference; alt, alternate; H3K27ac, histone H3 acetylation at lysine 27.
Funder
National Institute of Mental Health
Publisher
American Association for the Advancement of Science (AAAS)
Subject
Multidisciplinary
Cited by
659 articles.