Abstract
Polygenic risk scores (PRS) enhance population risk stratification and advance personalized medicine, yet existing methods face a tradeoff between predictive power and computational efficiency. We introduce ALL-Sum, a fast and scalable PRS method that combines an efficient summary statistic-based L0L2-penalized regression algorithm with an ensembling step that aggregates estimates from different tuning parameters for improved prediction performance. In extensive large-scale simulations across a wide range of polygenicity and genome-wide association study (GWAS) sample sizes, ALL-Sum consistently outperforms popular alternative methods in terms of prediction accuracy, runtime, and memory usage. We analyze 27 published sets of GWAS summary statistics for 11 complex traits from 9 reputable data sources, including the Global Lipids Genetics Consortium, Breast Cancer Association Consortium, and FinnGen, evaluated using individual-level UKBB data. ALL-Sum achieves the highest accuracy for most traits, particularly for GWAS with large sample sizes. We provide ALL-Sum as user-friendly command-line software with pre-computed reference data for streamlined user-end analysis.
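The abstract states neither the exact objective nor the aggregation rule, so the following is only a minimal sketch of the general idea and not the ALL-Sum implementation. It assumes standardized genotypes, a vector `r` of marginal SNP-trait correlations taken from GWAS summary statistics, and an LD matrix `R` from a reference panel, and it minimizes 0.5 b'Rb - r'b + lam0 * ||b||_0 + lam2 * ||b||_2^2 by cyclic coordinate descent. The ensembling step is replaced by a plain equal-weight average over a tuning grid, which is an assumption made purely for illustration. The function names `l0l2_coordinate_descent` and `average_ensemble` are hypothetical.

```python
def l0l2_coordinate_descent(r, R, lam0, lam2, n_iter=100, tol=1e-8):
    """Cyclic coordinate descent for an L0L2-penalized fit from summary statistics.

    r    : marginal SNP-trait correlations (list of floats), assumed input
    R    : LD matrix from a reference panel (list of lists), assumed input
    lam0 : L0 penalty (cost of each nonzero effect)
    lam2 : L2 penalty (ridge-style shrinkage)
    """
    p = len(r)
    beta = [0.0] * p
    for _ in range(n_iter):
        max_change = 0.0
        for j in range(p):
            # Signal left for SNP j after removing the other SNPs' effects via LD.
            a = r[j] - sum(R[j][k] * beta[k] for k in range(p) if k != j)
            denom = R[j][j] + 2.0 * lam2
            # Keep SNP j only if its gain in fit, a^2 / (2 * denom), exceeds lam0.
            new = a / denom if a * a > 2.0 * lam0 * denom else 0.0
            max_change = max(max_change, abs(new - beta[j]))
            beta[j] = new
        if max_change < tol:
            break
    return beta


def average_ensemble(r, R, grid):
    """Equal-weight average of the fits over a grid of (lam0, lam2) pairs.

    This simple average is an illustrative stand-in, not the paper's ensembling rule.
    """
    fits = [l0l2_coordinate_descent(r, R, lam0, lam2) for lam0, lam2 in grid]
    return [sum(column) / len(fits) for column in zip(*fits)]


# Toy example: three uncorrelated SNPs (identity LD matrix).
r = [0.5, 0.05, -0.3]
R = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
beta = l0l2_coordinate_descent(r, R, lam0=0.01, lam2=0.0)  # the weak second SNP is set to zero
combined = average_ensemble(r, R, grid=[(0.01, 0.0), (0.01, 0.5)])
```

In this sketch the L0 term zeroes out SNPs whose contribution does not justify a nonzero effect, while the L2 term shrinks the effects that remain. In practice, ensemble weights would more plausibly be learned on a separate tuning dataset than fixed to be equal.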
Publisher
Cold Spring Harbor Laboratory
Cited by
2 articles.