Telomere-to-telomere assembly of a complete human X chromosome
Author:
Miga Karen H., Koren Sergey, Rhie Arang, Vollger Mitchell R., Gershman Ariel, Bzikadze Andrey, Brooks Shelise, Howe Edmund, Porubsky David, Logsdon Glennis A., Schneider Valerie A., Potapova Tamara, Wood Jonathan, Chow William, Armstrong Joel, Fredrickson Jeanne, Pak Evgenia, Tigyi Kristof, Kremitzki Milinn, Markovic Christopher, Maduro Valerie, Dutra Amalia, Bouffard Gerard G., Chang Alexander M., Hansen Nancy F., Wilfert Amy B., Thibaud-Nissen Françoise, Schmitt Anthony D., Belton Jon-Matthew, Selvaraj Siddarth, Dennis Megan Y., Soto Daniela C., Sahasrabudhe Ruta, Kaya Gulhan, Quick Josh, Loman Nicholas J., Holmes Nadine, Loose Matthew, Surti Urvashi, Risques Rosa Ana, Graves-Lindsay Tina A., Fulton Robert, Hall Ira, Paten Benedict, Howe Kerstin, Timp Winston, Young Alice, Mullikin James C., Pevzner Pavel A., Gerton Jennifer L., Sullivan Beth A., Eichler Evan E., Phillippy Adam M.
Abstract
After two decades of improvements, the current human reference genome (GRCh38) is the most accurate and complete vertebrate genome ever produced. However, no single chromosome has been finished end to end, and hundreds of unresolved gaps persist [1,2]. Here we present a human genome assembly that surpasses the continuity of GRCh38 [2], along with a gapless, telomere-to-telomere assembly of a human chromosome. This was enabled by high-coverage, ultra-long-read nanopore sequencing of the complete hydatidiform mole CHM13 genome, combined with complementary technologies for quality improvement and validation. Focusing our efforts on the human X chromosome [3], we reconstructed the centromeric satellite DNA array (approximately 3.1 Mb) and closed the 29 remaining gaps in the current reference, including new sequences from the human pseudoautosomal regions and from cancer-testis ampliconic gene families (CT-X and GAGE). These sequences will be integrated into future human reference genome releases. In addition, the complete chromosome X, combined with the ultra-long nanopore data, allowed us to map methylation patterns across complex tandem repeats and satellite arrays. Our results demonstrate that finishing the entire human genome is now within reach, and the data presented here will facilitate ongoing efforts to complete the other human chromosomes.
Publisher
Springer Science and Business Media LLC
Subject
Multidisciplinary
References (67 articles)
1. Jain, M. et al. Nanopore sequencing and assembly of a human genome with ultra-long reads. Nat. Biotechnol. 36, 338–345 (2018).
2. Schneider, V. A. et al. Evaluation of GRCh38 and de novo haploid genome assemblies demonstrates the enduring quality of the reference assembly. Genome Res. 27, 849–864 (2017).
3. Ross, M. T. et al. The DNA sequence of the human X chromosome. Nature 434, 325–337 (2005).
4. Mefford, H. C. & Eichler, E. E. Duplication hotspots, rare genomic disorders, and common disease. Curr. Opin. Genet. Dev. 19, 196–204 (2009).
5. Langley, S. A., Miga, K. H., Karpen, G. H. & Langley, C. H. Haplotypes spanning centromeric regions reveal persistence of large blocks of archaic DNA. eLife 8, e42989 (2019).
Cited by
561 articles.