Genotype Imputation and Reference Panel: A Systematic Evaluation

Author:

Bai Wei-Yang, Zhu Xiao-Wei, Cong Pei-Kuan, Zhang Xue-Jun, Richards J Brent, Zheng Hou-Feng

Abstract

Here, 622 imputations were conducted with 394 customized reference panels for Han Chinese and European populations. Besides confirming that imputation accuracy always benefited from increased panel size when the reference panel was population-specific, the results yielded two new insights. First, when the haplotype size of the reference panel was fixed, the imputation accuracy of common and low-frequency variants (MAF>0.5%) decreased as the population diversity of the reference panel increased, but for rare variants (MAF<0.5%), a small fraction of diversity (<20%) in the panel could improve imputation accuracy. Second, when the haplotype size of the reference panel was increased with extra population-diverse samples, the imputation accuracy of common variants (MAF>5%) for the European population always benefited from the expanded sample size. For the Han Chinese population, however, the accuracy of all imputed variants was highest when the reference panel contained a fraction of extra diverse samples (15%∼21%). In addition, we evaluated existing reference panels such as HRC, 1000G Phase3 and CONVERGE. For the European population, HRC was the best reference panel. For the Han Chinese population, we proposed an optimum constituent ratio for researchers who would like to customize their own sequenced reference panel, but a high-quality, large-scale Chinese reference panel is still needed. Our findings could be generalized to other populations with conservative genomes, and a tool is provided to investigate other populations of interest (https://github.com/Abyss-bai/reference-panel-reconstruction).

Highlights (Key points)

A total of 394 reference panels were designed and customized by three strategies, and large-scale genotype imputations were performed with these panels for systematic evaluation in Han Chinese and European populations.

For the Han Chinese population, when the haplotype size of the reference panel was increased with extra samples (the most common case), the accuracy of imputed variants was highest when the reference panel contained a fraction of extra diverse samples (15%∼21%).

Imputation accuracy showed different trends between the Han Chinese and European populations. In a sense, the European genome may itself be more diverse than the Han Chinese genome.

Existing reference panels were not the best choice for Chinese imputation; a high-quality, large-scale Chinese reference panel is still needed.
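The two panel-design scenarios contrasted in the abstract (fixed panel size with growing diversity, versus a panel enlarged with extra diverse samples) can be illustrated with a minimal sketch. This is not the authors' tool: the function names, the sample IDs, and the reading of "fraction of diversity" as the diverse share of the final panel are all assumptions made for illustration. The sketch also counts samples rather than haplotypes; for diploid samples the haplotype count would be twice the sample count.

```python
import random


def fixed_size_panel(core_ids, diverse_ids, panel_size, diversity_fraction, seed=0):
    """Hypothetical helper: hold the panel size fixed and fill a given
    fraction of it with samples from other (diverse) populations."""
    if not 0.0 <= diversity_fraction <= 1.0:
        raise ValueError("diversity_fraction must be within [0, 1]")
    n_diverse = round(panel_size * diversity_fraction)
    n_core = panel_size - n_diverse
    if n_core > len(core_ids) or n_diverse > len(diverse_ids):
        raise ValueError("not enough samples to build the requested panel")
    rng = random.Random(seed)  # seeded so the selection is reproducible
    return rng.sample(core_ids, n_core) + rng.sample(diverse_ids, n_diverse)


def expanded_panel(core_ids, diverse_ids, diversity_fraction, seed=0):
    """Hypothetical helper: keep every population-specific sample and add
    diverse samples until they make up the given fraction of the panel."""
    if not 0.0 <= diversity_fraction < 1.0:
        raise ValueError("diversity_fraction must be within [0, 1)")
    # Solve n_diverse / (n_core + n_diverse) = f for n_diverse.
    n_diverse = round(diversity_fraction * len(core_ids) / (1.0 - diversity_fraction))
    if n_diverse > len(diverse_ids):
        raise ValueError("not enough diverse samples available")
    rng = random.Random(seed)
    return list(core_ids) + rng.sample(diverse_ids, n_diverse)


# Illustrative, made-up sample IDs (not from the study).
core = [f"core_{i}" for i in range(200)]
diverse = [f"div_{i}" for i in range(100)]

fixed = fixed_size_panel(core, diverse, panel_size=100, diversity_fraction=0.2)
grown = expanded_panel(core, diverse, diversity_fraction=0.2)
print(len(fixed), len(grown))
```

In the first function, adding diverse samples displaces population-specific ones; in the second, the population-specific samples are all retained and the panel grows. That difference is what distinguishes the two scenarios in the abstract.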

Publisher

Cold Spring Harbor Laboratory

Cited by 2 articles.
