High-resolution portability of 245 polygenic scores when derived and applied in the same cohort

Author:

Privé FlorianORCID,Aschard HuguesORCID,Carmi ShaiORCID,Folkersen LasseORCID,Hoggart Clive,O’Reilly Paul F.ORCID,Vilhjálmsson Bjarni J.ORCID

Abstract

AbstractThe low portability of polygenic scores (PGS) across global populations is a major concern that must be addressed before PGS can be used for everyone in the clinic. Indeed, prediction accuracy has been shown to decay as a function of the genetic distance between the training and test cohorts. However, such cohorts differ not only in their genetic distance but also in their geographical distance and their data collection and assaying, conflating multiple factors. In this study, we examine the extent to which PGS are transferable between ancestries by deriving polygenic scores for 245 curated traits from the UK Biobank data and applying them in nine ancestry groups from the same cohort. By restricting both training and testing to the UK Biobank data, we reduce the risk of environmental and genotyping confounding from using different cohorts. We define the nine ancestry groups at a high-resolution, country-specific level, based on a simple, robust and effective method that we introduce here. We then apply two different predictive methods to derive polygenic scores for all 245 phenotypes, and show a systematic and dramatic reduction in portability of PGS trained in the inferred ancestral UK population and applied to the inferred ancestral Polish - Italian - Iranian - Indian - Chinese - Caribbean - Nigerian - Ashkenazi populations, respectively. These analyses, performed at a finer scale than the usual continental scale, demonstrate that prediction already drops off within European ancestries and reduces globally in proportion to PC distance, even when all individuals reside in the same country and are genotyped and phenotyped as part of the same cohort. Our study provides high-resolution and robust insights into the PGS portability problem.

Publisher

Cold Spring Harbor Laboratory

Reference53 articles.

1. Accurate and robust genomic prediction of celiac disease using statistical learning;PLoS genetics,2014

2. Abraham, G. , Qiu, Y. , and Inouye, M. (2017). FlashPCA2: principal component analysis of biobank-scale genotype datasets. Bioinformatics.

3. Albiñana, C. , Grove, J. , McGrath, J. J. , Agerbo, E. , Wray, N. R. , Werge, T. , Børglum, A. D. , Mortensen, P. B. , Privé, F. , and Vilhjálmsson, B. J. (2020). Leveraging both individual-level genetic data and gwas summary statistics increases polygenic prediction. bioRxiv.

4. Fast model-based estimation of ancestry in unrelated individuals

5. No evidence from genome-wide data of a khazar origin for the ashkenazi jews;Human biology,2013

同舟云学术

1.学者识别学者识别

2.学术分析学术分析

3.人才评估人才评估

"同舟云学术"是以全球学者为主线,采集、加工和组织学术论文而形成的新型学术文献查询和分析系统,可以对全球学者进行文献检索和人才价值评估。用户可以通过关注某些学科领域的顶尖人物而持续追踪该领域的学科进展和研究前沿。经过近期的数据扩容,当前同舟云学术共收录了国内外主流学术期刊6万余种,收集的期刊论文及会议论文总量共计约1.5亿篇,并以每天添加12000余篇中外论文的速度递增。我们也可以为用户提供个性化、定制化的学者数据。欢迎来电咨询!咨询电话:010-8811{复制后删除}0370

www.globalauthorid.com

TOP

Copyright © 2019-2024 北京同舟云网络信息技术有限公司
京公网安备11010802033243号  京ICP备18003416号-3