Abstract
Background: Two decades of genome-wide association studies (GWAS) have led to the fast-growing application of polygenic risk scores (PRS) for risk prediction. However, due to differences in population structure and evolutionary paths, the PRS substrate, derived mostly from studies of European ancestry, does not work equally well for other ancestries. Prediction accuracy decays with an individual's genetic distance (GD) to the genetic centers (GC) of various populations.

Objectives: To develop a new PRS method and software that use individual GD to improve PRS risk prediction accuracy, especially for non-European populations.

Method: We hypothesize that adding a GD-based weight to PRS methods would enhance their risk prediction performance, particularly for minority groups. We explore GD first through principal components (PC) and then through phylogenetic tree structures. Building on an emerging software tool (PRS-CSx) that achieves high prediction accuracy across multiple ancestries, we present PRS-GRID, where "GRID" stands for "Genetic Reference based on Individual Distance".

Results: We developed a preliminary version of PRS-GRID and pilot-tested its prediction performance for a classic quantitative trait (e.g., height) and a disease trait (e.g., type-2 diabetes). We found a slight but noticeable improvement in risk prediction in minority populations. We further explored a random forest approach so that the performance of PRS-GRID could be clearly explained, which is a key step for PRS to be used in clinical and public health practice.

Conclusions: The PRS-GRID philosophy and method represent an innovative and significant advancement in the field of polygenic risk prediction. Our work provides a foundation for future research and clinical applications aimed at reducing health disparities and improving population health through personalized medicine.
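
The PC-based genetic distance idea in the Method can be illustrated with a minimal sketch. It is not the PRS-GRID implementation: the function names, the toy populations `POP_A` and `POP_B`, the softmax-style weighting, and the `scale` parameter are all assumptions made for illustration. The abstract states only that a GD-based weight is added to the PRS, not how that weight is computed.

```python
import math

def genetic_centers(pcs_by_population):
    """Mean PC coordinates (the 'genetic center') of each reference population."""
    centers = {}
    for pop, rows in pcs_by_population.items():
        n_dims = len(rows[0])
        centers[pop] = [sum(r[d] for r in rows) / len(rows) for d in range(n_dims)]
    return centers

def genetic_distances(individual_pcs, centers):
    """Euclidean distance in PC space from one individual to every center."""
    return {pop: math.dist(individual_pcs, c) for pop, c in centers.items()}

def distance_weights(distances, scale=1.0):
    """Hypothetical weighting (an assumption, not from the source):
    normalized exp(-distance / scale), so a closer center gets a larger weight."""
    scores = {pop: math.exp(-d / scale) for pop, d in distances.items()}
    total = sum(scores.values())
    return {pop: s / total for pop, s in scores.items()}

def weighted_prs(ancestry_specific_prs, weights):
    """Combine ancestry-specific scores using the distance-based weights."""
    return sum(weights[pop] * ancestry_specific_prs[pop] for pop in weights)

# Toy reference panel: two populations, two PCs each (made-up numbers).
reference_pcs = {
    "POP_A": [[0.0, 0.0], [2.0, 0.0]],
    "POP_B": [[4.0, 4.0], [6.0, 4.0]],
}
centers = genetic_centers(reference_pcs)
distances = genetic_distances([1.0, 3.0], centers)
weights = distance_weights(distances)
score = weighted_prs({"POP_A": 0.2, "POP_B": 0.8}, weights)
```

In this toy case the individual lies closer to the center of `POP_A`, so the `POP_A` score receives the larger weight. Any monotone decreasing function of distance could stand in for the exponential used here.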
Publisher
Cold Spring Harbor Laboratory