The complete and fully-phased diploid genome of a male Han Chinese

Authors

Yang Chentao, Zhou Yang, Song Yanni, Wu Dongya, Zeng Yan, Nie Lei, Liu Panhong, Zhang Shilong, Chen Guangji, Xu Jinjin, Zhou Hongling, Zhou Long, Qian Xiaobo, Liu Chenlu, Tan Shangjin, Zhou Chengran, Dai Wei, Xu Mengyang, Qi Yanwei, Wang Xiaobo, Guo Lidong, Fan Guangyi, Wang Aijun, Deng Yuan, Zhang Yong, Jin Jiazheng, He Yunqiu, Guo Chunxue, Guo Guoji, Zhou Qing, Xu Xun, Yang Huanming, Wang Jian, Xu Shuhua, Mao Yafei, Jin Xin, Ruan Jue, Zhang Guojie

Abstract

Since the release of the complete human genome, the priority of human genomic research has been shifting towards closing gaps in ethnic diversity. Here, we present a fully phased and well-annotated diploid human genome from a Han Chinese male individual (CN1), in which the assemblies of both haploids achieve the telomere-to-telomere (T2T) level. Comparison of this diploid genome with the CHM13 haploid T2T genome revealed significant variations in the centromere. Outside the centromere, we discovered 11,413 structural variations, including numerous novel ones. We also detected thousands of CN1 alleles that have accumulated high substitution rates and a few that have been under positive selection in the East Asian population. Further, we found that CN1 outperforms CHM13 as a reference genome in mapping and variant calling for the East Asian population, owing to the distinct structural variants of the two references. Comparison of SNP calling for a large cohort of 8,869 Chinese genomes, using CN1 and CHM13 as the reference respectively, showed that reference bias profoundly impacts rare SNP calling, with nearly 2 million rare SNPs miscalled depending on the reference genome. Finally, using CN1 as a reference, we discovered 5.80 Mb and 4.21 Mb of putative introgressed sequences from Neanderthal and Denisovan, respectively, including many East Asian-specific ones undetected using CHM13 as the reference. Our analyses reveal the advantages of using CN1 as a reference for population genomic and paleogenomic studies. This complete genome will serve as an alternative reference for future genomic studies on the East Asian population.

Funder

International Institutes of Medicine at Yiwu and Kunpeng Fellowship

National Key Research and Development Project Program of China

Publisher

Springer Science and Business Media LLC

Subject

Cell Biology, Molecular Biology

Cited by 7 articles.
