Towards complete and error-free genome assemblies of all vertebrate species

Author:

Rhie Arang, McCarthy Shane A., Fedrigo Olivier, Damas Joana, Formenti Giulio, Koren Sergey, Uliano-Silva Marcela, Chow William, Fungtammasan Arkarachai, Kim Juwan, Lee Chul, Ko Byung June, Chaisson Mark, Gedman Gregory L., Cantin Lindsey J., Thibaud-Nissen Francoise, Haggerty Leanne, Bista Iliana, Smith Michelle, Haase Bettina, Mountcastle Jacquelyn, Winkler Sylke, Paez Sadye, Howard Jason, Vernes Sonja C., Lama Tanya M., Grutzner Frank, Warren Wesley C., Balakrishnan Christopher N., Burt Dave, George Julia M., Biegler Matthew T., Iorns David, Digby Andrew, Eason Daryl, Robertson Bruce, Edwards Taylor, Wilkinson Mark, Turner George, Meyer Axel, Kautt Andreas F., Franchini Paolo, Detrich H. William, Svardal Hannes, Wagner Maximilian, Naylor Gavin J. P., Pippel Martin, Malinsky Milan, Mooney Mark, Simbirsky Maria, Hannigan Brett T., Pesout Trevor, Houck Marlys, Misuraca Ann, Kingan Sarah B., Hall Richard, Kronenberg Zev, Sović Ivan, Dunn Christopher, Ning Zemin, Hastie Alex, Lee Joyce, Selvaraj Siddarth, Green Richard E., Putnam Nicholas H., Gut Ivo, Ghurye Jay, Garrison Erik, Sims Ying, Collins Joanna, Pelan Sarah, Torrance James, Tracey Alan, Wood Jonathan, Dagnew Robel E., Guan Dengfeng, London Sarah E., Clayton David F., Mello Claudio V., Friedrich Samantha R., Lovell Peter V., Osipova Ekaterina, Al-Ajli Farooq O., Secomandi Simona, Kim Heebal, Theofanopoulou Constantina, Hiller Michael, Zhou Yang, Harris Robert S., Makova Kateryna D., Medvedev Paul, Hoffman Jinna, Masterson Patrick, Clark Karen, Martin Fergal, Howe Kevin, Flicek Paul, Walenz Brian P., Kwak Woori, Clawson Hiram, Diekhans Mark, Nassar Luis, Paten Benedict, Kraus Robert H. S., Crawford Andrew J., Gilbert M. Thomas P., Zhang Guojie, Venkatesh Byrappa, Murphy Robert W., Koepfli Klaus-Peter, Shapiro Beth, Johnson Warren E., Di Palma Federica, Marques-Bonet Tomas, Teeling Emma C., Warnow Tandy, Graves Jennifer Marshall, Ryder Oliver A., Haussler David, O’Brien Stephen J., Korlach Jonas, Lewin Harris A., Howe Kerstin, Myers Eugene W., Durbin Richard, Phillippy Adam M., Jarvis Erich D.

Abstract

High-quality and complete reference genome assemblies are fundamental for the application of genomics to biology, disease, and biodiversity conservation. However, such assemblies are available for only a few non-microbial species [1–4]. To address this issue, the international Genome 10K (G10K) consortium [5,6] has worked over a five-year period to evaluate and develop cost-effective methods for assembling highly accurate and nearly complete reference genomes. Here we present lessons learned from generating assemblies for 16 species that represent six major vertebrate lineages. We confirm that long-read sequencing technologies are essential for maximizing genome quality, and that unresolved complex repeats and haplotype heterozygosity are major sources of assembly error when not handled correctly. Our assemblies correct substantial errors, add missing sequence in some of the best historical reference genomes, and reveal biological discoveries. These include the identification of many false gene duplications, increases in gene sizes, chromosome rearrangements that are specific to lineages, a repeated independent chromosome breakpoint in bat genomes, and a canonical GC-rich pattern in protein-coding genes and their regulatory regions. Adopting these lessons, we have embarked on the Vertebrate Genomes Project (VGP), an international effort to generate high-quality, complete reference genomes for all of the roughly 70,000 extant vertebrate species and to help to enable a new era of discovery across the life sciences.

Publisher

Springer Science and Business Media LLC

Subject

Multidisciplinary
