Improving genetic risk prediction across diverse population by disentangling ancestry representations

Published:2023-09-22 Issue:1 Volume:6 Page:
ISSN:2399-3642
Container-title:Communications Biology
language:en
Short-container-title:Commun Biol

Author:

Gyawali Prashnna K., Le Guen Yann, Liu Xiaoxia, Belloy Michael E., Tang Hua, Zou James, He Zihuai

Abstract

Risk prediction models using genetic data have seen increasing traction in genomics. However, most of the polygenic risk models were developed using data from participants with similar (mostly European) ancestry. This can lead to biases in the risk predictors resulting in poor generalization when applied to minority populations and admixed individuals such as African Americans. To address this issue, largely due to the prediction models being biased by the underlying population structure, we propose a deep-learning framework that leverages data from diverse population and disentangles ancestry from the phenotype-relevant information in its representation. The ancestry disentangled representation can be used to build risk predictors that perform better across minority populations. We applied the proposed method to the analysis of Alzheimer’s disease genetics. Comparing with standard linear and nonlinear risk prediction methods, the proposed method substantially improves risk prediction in minority populations, including admixed individuals, without needing self-reported ancestry information.

Funder

U.S. Department of Health & Human Services | NIH | National Institute on Aging

Publisher

Springer Science and Business Media LLC

Subject

General Agricultural and Biological Sciences; General Biochemistry, Genetics and Molecular Biology; Medicine (miscellaneous)

Link

https://www.nature.com/articles/s42003-023-05352-6.pdf

Reference: 47 articles.

1. Zhang, Q. et al. Risk prediction of late-onset Alzheimer’s disease implies an oligogenic architecture. Nat. Commun. 11, 1–11 (2020).

2. Escott-Price, V., Shoai, M., Pither, R., Williams, J. & Hardy, J. Polygenic score prediction captures nearly all common genetic risk for Alzheimer’s disease. Neurobiol. Aging 49, 214.e7 (2017).

3. Leonenko, G. et al. Identifying individuals with high risk of Alzheimer’s disease using polygenic risk scores. Nat. Commun. 12, 4506 (2021).

4. Squillario, M. et al. A telescope GWAS analysis strategy, based on SNPs-genes-pathways ensamble and on multivariate algorithms, to characterize late onset Alzheimer’s disease. Sci. Rep. 10, 1–12 (2020).

5. Jo, T., Nho, K., Bice, P. & Saykin, A. J. Deep learning-based identification of genetic variants: Application to Alzheimer’s disease classification. Brief Bioinform. 23, bbac022 (2022).

Cited by 3 articles.

1. Machine learning methods applied to classify complex diseases using genomic data;2024-03-20

2. Highly parameterized polygenic scores tend to overfit to population stratification via random effects;2024-01-29

3. Validation of a Community-Based Approach Toward Personalized Dementia Risk Reduction: The Kimel Family Centre for Brain Health and Wellness;The Journal of Prevention of Alzheimer's Disease;2024