The complete sequence of a human Y chromosome
Author:
Rhie Arang, Nurk Sergey, Cechova Monika, Hoyt Savannah J., Taylor Dylan J., Altemose Nicolas, Hook Paul W., Koren Sergey, Rautiainen Mikko, Alexandrov Ivan A., Allen Jamie, Asri Mobin, Bzikadze Andrey V., Chen Nae-Chyun, Chin Chen-Shan, Diekhans Mark, Flicek Paul, Formenti Giulio, Fungtammasan Arkarachai, Giron Carlos Garcia, Garrison Erik, Gershman Ariel, Gerton Jennifer, Grady Patrick G.S., Guarracino Andrea, Haggerty Leanne, Halabian Reza, Hansen Nancy F., Harris Robert, Hartley Gabrielle A., Harvey William T., Haukness Marina, Heinz Jakob, Hourlier Thibaut, Hubley Robert M., Hunt Sarah E., Hwang Stephen, Jain Miten, Kesharwani Rupesh K., Lewis Alexandra P., Li Heng, Logsdon Glennis A., Lucas Julian K., Makalowski Wojciech, Markovic Christopher, Martin Fergal J., Mc Cartney Ann M., McCoy Rajiv C., McDaniel Jennifer, McNulty Brandy M., Medvedev Paul, Mikheenko Alla, Munson Katherine M., Murphy Terence D., Olsen Hugh E., Olson Nathan D., Paulin Luis F., Porubsky David, Potapova Tamara, Ryabov Fedor, Salzberg Steven L., Sauria Michael E.G., Sedlazeck Fritz J., Shafin Kishwar, Shepelev Valery A., Shumate Alaina, Storer Jessica M., Surapaneni Likhitha, Taravella Oill Angela M., Thibaud-Nissen Françoise, Timp Winston, Tomaszkiewicz Marta, Vollger Mitchell R., Walenz Brian P., Watwood Allison C., Weissensteiner Matthias H., Wenger Aaron M., Wilson Melissa A., Zarate Samantha, Zhu Yiming, Zook Justin M., Eichler Evan E., O’Neill Rachel, Schatz Michael C., Miga Karen H., Makova Kateryna D., Phillippy Adam M.
Abstract
The human Y chromosome has been notoriously difficult to sequence and assemble because of its complex repeat structure, including long palindromes, tandem repeats, and segmental duplications. As a result, more than half of the Y chromosome is missing from the GRCh38 reference sequence and it remains the last human chromosome to be finished. Here, the Telomere-to-Telomere (T2T) consortium presents the complete 62,460,029 base pair sequence of a human Y chromosome from the HG002 genome (T2T-Y) that corrects multiple errors in GRCh38-Y and adds over 30 million base pairs of sequence to the reference, revealing the complete ampliconic structures of TSPY, DAZ, and RBMY; 42 additional protein-coding genes, mostly from the TSPY gene family; and an alternating pattern of human satellite 1 and 3 blocks in the heterochromatic Yq12 region. We have combined T2T-Y with a prior assembly of the CHM13 genome and mapped available population variation, clinical variants, and functional genomics data to produce a complete and comprehensive reference sequence for all 24 human chromosomes.
Publisher
Cold Spring Harbor Laboratory
Cited by
22 articles.