Genome-wide de novo risk score implicates promoter variation in autism spectrum disorder

Author:

An Joon-Yong (1); Lin Kevin (2); Zhu Lingxue (2); Werling Donna M. (1); Dong Shan (1); Brand Harrison (3, 4, 5); Wang Harold Z. (3); Zhao Xuefang (3, 4, 5); Schwartz Grace B. (1); Collins Ryan L. (3, 4, 6); Currall Benjamin B. (3, 4, 5); Dastmalchi Claudia (1); Dea Jeanselle (1); Duhn Clif (1); Gilson Michael C. (1); Klei Lambertus (7); Liang Lindsay (1); Markenscoff-Papadimitriou Eirene (1); Pochareddy Sirisha (8); Ahituv Nadav (9, 10); Buxbaum Joseph D. (11, 12, 13, 14); Coon Hilary (15, 16); Daly Mark J. (5, 17, 18); Kim Young Shin (1); Marth Gabor T. (19, 20); Neale Benjamin M. (5, 17, 18); Quinlan Aaron R. (16, 19, 20); Rubenstein John L. (1); Sestan Nenad (8); State Matthew W. (1, 10); Willsey A. Jeremy (1, 21, 22); Talkowski Michael E. (3, 4, 5, 23); Devlin Bernie (7); Roeder Kathryn (2, 24); Sanders Stephan J. (1, 10)

Affiliation:

1. Department of Psychiatry, UCSF Weill Institute for Neurosciences, University of California, San Francisco, CA, USA.

2. Department of Statistics and Data Science, Carnegie Mellon University, Pittsburgh, PA 15213, USA.

3. Center for Genomic Medicine and Department of Neurology, Massachusetts General Hospital, Boston, MA, USA.

4. Department of Neurology, Harvard Medical School, Boston, MA, USA.

5. Program in Medical and Population Genetics and the Stanley Center for Psychiatric Research, Broad Institute, Cambridge, MA, USA.

6. Program in Bioinformatics and Integrative Genomics, Division of Medical Sciences, Harvard Medical School, Boston, MA, USA.

7. Department of Psychiatry, University of Pittsburgh School of Medicine, Pittsburgh, PA 15213, USA.

8. Department of Neuroscience and Kavli Institute for Neuroscience, Yale School of Medicine, New Haven, CT 06510, USA.

9. Department of Bioengineering and Therapeutic Sciences, University of California, San Francisco, CA, USA.

10. Institute for Human Genetics, University of California, San Francisco, CA, USA.

11. Seaver Autism Center for Research and Treatment, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA.

12. Department of Psychiatry, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA.

13. Friedman Brain Institute, Icahn School of Medicine at Mount Sinai, New York, NY, USA.

14. Mindich Child Health and Development Institute, Icahn School of Medicine at Mount Sinai, New York, NY, USA.

15. Department of Psychiatry, University of Utah School of Medicine, Salt Lake City, UT, USA.

16. Department of Biomedical Informatics, University of Utah School of Medicine, Salt Lake City, UT, USA.

17. Analytical and Translational Genetics Unit and Center for Genomic Medicine, Massachusetts General Hospital, Boston, MA, USA.

18. Department of Medicine, Harvard Medical School, Boston, MA, USA.

19. Department of Human Genetics, University of Utah School of Medicine, Salt Lake City, UT, USA.

20. USTAR Center for Genetic Discovery, University of Utah School of Medicine, Salt Lake City, UT, USA.

21. Institute for Neurodegenerative Diseases, UCSF Weill Institute for Neurosciences, University of California, San Francisco, CA, USA.

22. Quantitative Biosciences Institute, University of California, San Francisco, CA, USA.

23. Departments of Pathology and Psychiatry, Massachusetts General Hospital, Boston, MA, USA.

24. Computational Biology Department, Carnegie Mellon University, Pittsburgh, PA 15213, USA.

Abstract

INTRODUCTION

The DNA of protein-coding genes is transcribed into mRNA, which is translated into proteins. The “coding genome” describes the DNA that contains the information to make these proteins and represents ~1.5% of the human genome. Newly arising de novo mutations (variants observed in a child but not in either parent) in the coding genome contribute to numerous childhood developmental disorders, including autism spectrum disorder (ASD). Discovery of these effects is aided by the triplet code that enables the functional impact of many mutations to be readily deciphered. In contrast, the “noncoding genome” covers the remaining ~98.5% and includes elements that regulate when, where, and to what degree protein-coding genes are transcribed. Understanding this noncoding sequence could provide insights into human disorders and refined control of emerging genetic therapies. Yet little is known about the role of mutations in noncoding regions, including whether they contribute to childhood developmental disorders, which noncoding elements are most vulnerable to disruption, and the manner in which information is encoded in the noncoding genome.

RATIONALE

Whole-genome sequencing (WGS) provides the opportunity to identify the majority of genetic variation in each individual. By performing WGS on 1902 quartet families including a child affected with ASD, one unaffected sibling control, and their parents, we identified ~67 de novo mutations across each child’s genome. To characterize the functional role of these mutations, we integrated multiple datasets relating to gene function, genes implicated in neurodevelopmental disorders, conservation across species, and epigenetic markers, thereby combinatorially defining 55,143 categories. The scope of the problem (testing for an excess of de novo mutations in cases relative to controls for each category) is challenging because there are more categories than families.
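The per-category case-versus-control comparison described in the rationale can be illustrated with a minimal sketch. It assumes an exact one-sided binomial test with equal numbers of cases and sibling controls, and a plain Bonferroni correction over the 55,143 categories; the abstract does not specify the test or the correction the authors used, and the function name and toy counts below are hypothetical.

```python
from math import comb


def binomial_excess_pvalue(case_count: int, control_count: int) -> float:
    """One-sided exact binomial p-value for an excess of mutations in cases.

    Assumption: with one case and one sibling control per family, each
    mutation in a category is equally likely to be in a case or a control
    under the null, so case_count ~ Binomial(n, 0.5).
    """
    n = case_count + control_count
    if n == 0:
        return 1.0  # no mutations in this category, nothing to test
    # P(X >= case_count) for X ~ Binomial(n, 0.5)
    tail = sum(comb(n, k) for k in range(case_count, n + 1))
    return tail / 2 ** n


# Hypothetical (case, control) mutation counts; not data from the study.
toy_counts = {"category_a": (30, 10), "category_b": (12, 11)}
N_CATEGORIES = 55_143  # number of annotation categories stated in the abstract

for name, (cases, controls) in toy_counts.items():
    p = binomial_excess_pvalue(cases, controls)
    # Bonferroni correction, used here only to illustrate the testing burden
    p_adjusted = min(1.0, p * N_CATEGORIES)
    print(f"{name}: p = {p:.4g}, Bonferroni-adjusted p = {p_adjusted:.4g}")
```

In this toy example a category with a threefold excess in cases has a nominal p-value near 0.001, yet the adjusted value is capped at 1, which is one way to see why testing tens of thousands of categories individually has little power.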
RESULTS

Comparing cases to controls, we observed an excess of de novo mutations in cases in individual categories in the coding genome but not in the noncoding genome. To overcome the challenge of detecting noncoding association, we used machine learning tools to develop a de novo risk score to look for an excess of de novo mutations across multiple categories. This score demonstrated a contribution to ASD risk from coding mutations and a weaker, but significant, contribution from noncoding mutations. This noncoding signal was driven by mutations in the promoter region, defined as the 2000 nucleotides upstream of the transcription start site (TSS) where mRNA synthesis starts. The strongest promoter signals were defined by conservation across species and transcription factor binding sites. Well-defined promoter elements (e.g., TATA-box) are usually observed within 80 nucleotides of the TSS; however, the strongest ASD association was observed distally, 750 to 2000 nucleotides upstream of the TSS.

CONCLUSION

We conclude that de novo mutations in the noncoding genome contribute to ASD. The clearest evidence of noncoding ASD association came from mutations at evolutionarily conserved nucleotides in the promoter region. The enrichment for transcription factor binding sites, primarily in the distal promoter, suggests that these mutations may disrupt gene transcription via their interaction with enhancer elements in the promoter region, rather than interfering with transcriptional initiation directly.

Promoter regions in autism. De novo mutations from 1902 quartet families are assigned to 55,143 annotation categories, which are each assessed for autism spectrum disorder (ASD) association by comparing mutation counts in cases and sibling controls. A de novo risk score demonstrated a noncoding contribution to ASD driven by promoter mutations, especially at sites conserved across species, in the distal promoter or targeted by transcription factors.
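The promoter definition used in the results (the 2000 nucleotides upstream of the TSS, with the strongest signal 750 to 2000 nucleotides upstream) can be made concrete with a small strand-aware sketch. The function names, the "proximal" label for the remaining promoter, and the inclusive boundary handling are assumptions made for illustration; the abstract does not state how the authors handled coordinates or boundaries.

```python
def upstream_distance(position: int, tss: int, strand: str) -> int:
    """Distance of a variant upstream of the TSS (positive means upstream).

    On the '+' strand upstream is toward lower coordinates; on the '-'
    strand it is toward higher coordinates.
    """
    if strand == "+":
        return tss - position
    if strand == "-":
        return position - tss
    raise ValueError(f"strand must be '+' or '-', got {strand!r}")


def promoter_zone(position: int, tss: int, strand: str,
                  promoter_len: int = 2000, distal_start: int = 750) -> str:
    """Classify a variant as 'distal', 'proximal', or 'outside' the promoter.

    Assumed conventions: the promoter covers distances 1..promoter_len
    upstream of the TSS, and 'distal' means distal_start..promoter_len.
    """
    distance = upstream_distance(position, tss, strand)
    if distance < 1 or distance > promoter_len:
        return "outside"
    if distance >= distal_start:
        return "distal"
    return "proximal"


# Hypothetical coordinates, for illustration only.
print(promoter_zone(position=9_000, tss=10_000, strand="+"))   # 1000 nt upstream
print(promoter_zone(position=9_900, tss=10_000, strand="+"))   # 100 nt upstream
print(promoter_zone(position=11_000, tss=10_000, strand="-"))  # 1000 nt upstream on '-'
```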

Funder

National Institute of Mental Health

Simons Foundation

National Human Genome Research Institute

National Institute of General Medical Sciences

Eunice Kennedy Shriver National Institute of Child Health and Human Development

Beatrice and Samuel A. Seaver Foundation

Publisher

American Association for the Advancement of Science (AAAS)

Subject

Multidisciplinary
