Sequencing of 53,831 diverse genomes from the NHLBI TOPMed Program
Author:
Taliun Daniel, Harris Daniel N., Kessler Michael D., Carlson Jedidiah, Szpiech Zachary A., Torres Raul, Taliun Sarah A. Gagliano, Corvelo André, Gogarten Stephanie M., Kang Hyun Min, Pitsillides Achilleas N., LeFaive Jonathon, Lee Seung-been, Tian Xiaowen, Browning Brian L., Das Sayantan, Emde Anne-Katrin, Clarke Wayne E., Loesch Douglas P., Shetty Amol C., Blackwell Thomas W., Smith Albert V., Wong Quenna, Liu Xiaoming, Conomos Matthew P., Bobo Dean M., Aguet François, Albert Christine, Alonso Alvaro, Ardlie Kristin G., Arking Dan E., Aslibekyan Stella, Auer Paul L., Barnard John, Barr R. Graham, Barwick Lucas, Becker Lewis C., Beer Rebecca L., Benjamin Emelia J., Bielak Lawrence F., Blangero John, Boehnke Michael, Bowden Donald W., Brody Jennifer A., Burchard Esteban G., Cade Brian E., Casella James F., Chalazan Brandon, Chasman Daniel I., Chen Yii-Der Ida, Cho Michael H., Choi Seung Hoan, Chung Mina K., Clish Clary B., Correa Adolfo, Curran Joanne E., Custer Brian, Darbar Dawood, Daya Michelle, de Andrade Mariza, DeMeo Dawn L., Dutcher Susan K., Ellinor Patrick T., Emery Leslie S., Eng Celeste, Fatkin Diane, Fingerlin Tasha, Forer Lukas, Fornage Myriam, Franceschini Nora, Fuchsberger Christian, Fullerton Stephanie M., Germer Soren, Gladwin Mark T., Gottlieb Daniel J., Guo Xiuqing, Hall Michael E., He Jiang, Heard-Costa Nancy L., Heckbert Susan R., Irvin Marguerite R., Johnsen Jill M., Johnson Andrew D., Kaplan Robert, Kardia Sharon L. R., Kelly Tanika, Kelly Shannon, Kenny Eimear E., Kiel Douglas P., Klemmer Robert, Konkle Barbara A., Kooperberg Charles, Köttgen Anna, Lange Leslie A., Lasky-Su Jessica, Levy Daniel, Lin Xihong, Lin Keng-Han, Liu Chunyu, Loos Ruth J. F., Garman Lori, Gerszten Robert, Lubitz Steven A., Lunetta Kathryn L., Mak Angel C. Y., Manichaikul Ani, Manning Alisa K., Mathias Rasika A., McManus David D., McGarvey Stephen T., Meigs James B., Meyers Deborah A., Mikulla Julie L., Minear Mollie A., Mitchell Braxton D., Mohanty Sanghamitra, Montasser May E., Montgomery Courtney, Morrison Alanna C., Murabito Joanne M., Natale Andrea, Natarajan Pradeep, Nelson Sarah C., North Kari E., O’Connell Jeffrey R., Palmer Nicholette D., Pankratz Nathan, Peloso Gina M., Peyser Patricia A., Pleiness Jacob, Post Wendy S., Psaty Bruce M., Rao D. C., Redline Susan, Reiner Alexander P., Roden Dan, Rotter Jerome I., Ruczinski Ingo, Sarnowski Chloé, Schoenherr Sebastian, Schwartz David A., Seo Jeong-Sun, Seshadri Sudha, Sheehan Vivien A., Sheu Wayne H., Shoemaker M. Benjamin, Smith Nicholas L., Smith Jennifer A., Sotoodehnia Nona, Stilp Adrienne M., Tang Weihong, Taylor Kent D., Telen Marilyn, Thornton Timothy A., Tracy Russell P., Van Den Berg David J., Vasan Ramachandran S., Viaud-Martinez Karine A., Vrieze Scott, Weeks Daniel E., Weir Bruce S., Weiss Scott T., Weng Lu-Chen, Willer Cristen J., Zhang Yingze, Zhao Xutong, Arnett Donna K., Ashley-Koch Allison E., Barnes Kathleen C., Boerwinkle Eric, Gabriel Stacey, Gibbs Richard, Rice Kenneth M., Rich Stephen S., Silverman Edwin K., Qasba Pankaj, Gan Weiniu, Papanicolaou George J., Nickerson Deborah A., Browning Sharon R., Zody Michael C., Zöllner Sebastian, Wilson James G., Cupples L. Adrienne, Laurie Cathy C., Jaquish Cashell E., Hernandez Ryan D., O’Connor Timothy D., Abecasis Gonçalo R.
Abstract
The Trans-Omics for Precision Medicine (TOPMed) programme seeks to elucidate the genetic architecture and biology of heart, lung, blood and sleep disorders, with the ultimate goal of improving diagnosis, treatment and prevention of these diseases. The initial phases of the programme focused on whole-genome sequencing of individuals with rich phenotypic data and diverse backgrounds. Here we describe the TOPMed goals and design as well as the available resources and early insights obtained from the sequence data. The resources include a variant browser, a genotype imputation server, and genomic and phenotypic data that are available through dbGaP (Database of Genotypes and Phenotypes) [1]. In the first 53,831 TOPMed samples, we detected more than 400 million single-nucleotide and insertion or deletion variants after alignment with the reference genome. Additional previously undescribed variants were detected through assembly of unmapped reads and customized analysis in highly variable loci. Among the more than 400 million detected variants, 97% have frequencies of less than 1% and 46% are singletons that are present in only one individual (53% among unrelated individuals). These rare variants provide insights into mutational processes and recent human evolutionary history. The extensive catalogue of genetic variation in TOPMed studies provides unique opportunities for exploring the contributions of rare and noncoding sequence variants to phenotypic variation. Furthermore, combining TOPMed haplotypes with modern imputation methods improves the power and reach of genome-wide association studies to include variants down to a frequency of approximately 0.01%.
Publisher
Springer Science and Business Media LLC
Subject
Multidisciplinary
References: 114 articles.
1. Mailman, M. D. et al. The NCBI dbGaP database of genotypes and phenotypes. Nat. Genet. 39, 1181–1186 (2007).
2. Bycroft, C. et al. The UK Biobank resource with deep phenotyping and genomic data. Nature 562, 203–209 (2018).
3. Lek, M. et al. Analysis of protein-coding genetic variation in 60,706 humans. Nature 536, 285–291 (2016).
4. Karczewski, K. J. et al. The mutational constraint spectrum quantified from variation in 141,456 humans. Nature 581, 431–443 (2020).
5. Bodea, C. A. et al. A method to exploit the structure of genetic ancestry space to enhance case–control studies. Am. J. Hum. Genet. 98, 857–868 (2016).
Cited by
1251 articles.