Human whole-exome genotype data for Alzheimer’s disease
Published:2024-01-23
Issue:1
Volume:15
ISSN:2041-1723
Container-title:Nature Communications
language:en
Short-container-title:Nat Commun
Author:
Leung Yuk Yee, Naj Adam C., Chou Yi-Fan, Valladares Otto, Schmidt Michael, Hamilton-Nelson Kara, Wheeler Nicholas, Lin Honghuang, Gangadharan Prabhakaran, Qu Liming, Clark Kaylyn, Kuzma Amanda B., Lee Wan-Ping, Cantwell Laura, Nicaretta Heather, van der Lee Sven, English Adam, Kalra Divya, Muzny Donna, Skinner Evette, Doddapeneni Harsha, Dinh Huyen, Hu Jianhong, Santibanez Jireh, Jayaseelan Joy, Worley Kim, Gibbs Richard A., Lee Sandra, Dugan-Perez Shannon, Korchina Viktoriya, Nasser Waleed, Liu Xiuping, Han Yi, Zhu Yiming, Liu Yue, Khan Ziad, Zhu Congcong, Sun Fangui Jenny, Jun Gyungah R., Chung Jaeyoon, Farrell John, Zhang Xiaoling, Banks Eric, Gupta Namrata, Gabriel Stacey, Butkiewicz Mariusz, Benchek Penelope, Smieszek Sandra, Song Yeunjoo, Vardarajan Badri, Reitz Christiane, Reyes-Dumeyer Dolly, Tosto Giuseppe, De Jager Phillip L., Barral Sandra, Ma Yiyi, Beiser Alexa, Liu Ching Ti, Dupuis Josee, Lunetta Kathy, Cupples L. Adrienne, Choi Seung Hoan, Chen Yuning, Mez Jesse, Vanderspek Ashley, Ikram M. Arfan, Ahmad Shahzad, Faber Kelley, Foroud Tatiana, Mlynarski Elisabeth, Schmidt Helena, Schmidt Reinhold, Kunkle Brian, Rajabli Farid, Beecham Gary, Vance Jeffrey M., Adams Larry D., Cuccaro Michael, Mena Pedro, Booth Briana M., Renton Alan, Goate Alison, Marcora Edoardo, Stine Adam, Feolo Michael, Launer Lenore J., Koboldt Daniel C., Wilson Richard K., van Duijn Cornelia, Amin Najaf, Kapoor Manav, Salerno William, Bennett David A., Xia Li Charlie, Malamon John, Mosley Thomas H., Satizabal Claudia, Jan Bressler, Jian Xueqiu, Nato Alejandro Q., Horimoto Andrea R., Wang Bowen, Psaty Bruce, Witten Daniela, Tsuang Debby, Blue Elizabeth, Wijsman Ellen, Sohi Harkirat, Nguyen Hiep, Bis Joshua C., Rice Kenneth, Brown Lisa, Dorschner Michael, Saad Mohamad, Navas Pat, Nafikov Rafael, Thornton Timothy, Day Tyler, Haut Jacob, Sha Jin, Zhang Nancy, Iqbal Taha, Zhao Yi, Below Jennifer E., Larson David E., Appelbaum Elizabeth, Waligorski Jason, Antonacci-Fulton Lucinda, Fulton Robert S., Haines Jonathan, Farrer Lindsay, Seshadri Sudha, Brkanac Zoran, Cruchaga Carlos, Pericak-Vance Margaret, Mayeux Richard P., Bush William S., Destefano Anita, Martin Eden, Schellenberg Gerard D., Wang Li-San
Abstract
The heterogeneity of whole-exome sequencing (WES) data generation methods presents a challenge to joint analysis. Here we present a bioinformatics strategy for joint-calling 20,504 WES samples collected across nine studies and sequenced using ten capture kits in fourteen sequencing centers in the Alzheimer’s Disease Sequencing Project. The joint-genotyped variant call format (VCF) file contains only positions within the union of the capture kits. The VCF was then processed specifically to account for batch effects arising from the use of different capture kits in different studies. We identified 8.2 million autosomal variants. Of these, 96.82% are high-quality and are located in 28,579 Ensembl transcripts; 41% are intronic, and 1.8% have CADD scores > 30, indicating high predicted pathogenicity. We show that this strategy can generate high-quality data from these diversely generated WES samples. The improved ability to combine data sequenced in different batches benefits the whole genomics research community.
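The restriction to "the union of capture kits" can be illustrated with a minimal sketch. It assumes each kit's targets are given as BED-style, 0-based half-open intervals `(chrom, start, end)` and that variant positions are 1-based as in VCF. The function names `union_intervals` and `in_union` are hypothetical and are not taken from the authors' pipeline.

```python
from collections import defaultdict


def union_intervals(kits):
    """Merge target intervals from several capture kits into one union.

    `kits` is an iterable of interval lists; each interval is a
    (chrom, start, end) tuple in BED-style 0-based half-open coordinates.
    Returns {chrom: [(start, end), ...]} with sorted, non-overlapping spans.
    """
    by_chrom = defaultdict(list)
    for kit in kits:
        for chrom, start, end in kit:
            by_chrom[chrom].append((start, end))

    merged = {}
    for chrom, spans in by_chrom.items():
        spans.sort()
        out = [list(spans[0])]
        for start, end in spans[1:]:
            if start <= out[-1][1]:
                # Overlapping or abutting: extend the current span.
                out[-1][1] = max(out[-1][1], end)
            else:
                out.append([start, end])
        merged[chrom] = [tuple(span) for span in out]
    return merged


def in_union(merged, chrom, pos):
    """Return True if a 1-based VCF position lies inside the merged targets."""
    # A 0-based half-open interval [start, end) covers 1-based start+1..end.
    return any(start < pos <= end for start, end in merged.get(chrom, []))
```

With two toy kits, `[("chr1", 100, 200)]` and `[("chr1", 150, 250)]`, the union is a single span `("chr1", 100, 250)`, and a variant record would be kept only if `in_union` returns `True` for its chromosome and position.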
Funder
U.S. Department of Health & Human Services | NIH | National Institute on Aging
Publisher
Springer Science and Business Media LLC