Abstract
Polygenic risk scores (PRSs) are expected to play a critical role in achieving precision medicine. Currently, PRS predictors are generally based on linear models using summary statistics and, more recently, individual-level data. However, these predictors mainly capture additive relationships and are limited in the data modalities they can use. Here, we developed a deep learning framework (EIR) for PRS prediction, which includes a model, genome-local-net (GLN), specifically designed for large-scale genomics data. The framework supports multi-task (MT) learning, automatic integration of other clinical and biochemical data, and model explainability. When applied to individual-level data in the UK Biobank, we found that GLN outperformed LASSO for a wide range of diseases, in particular autoimmune diseases. We show that this was likely due to modelling epistasis, and we showcase this by identifying widespread epistasis for Type 1 Diabetes. Furthermore, we trained PRS by integrating genotype, blood, urine and anthropometric data and found that this improved performance for 93% of the 290 diseases and disorders considered. Finally, we found that including genotype data provided better-calibrated PRS models compared to using measurements alone. EIR is available at https://github.com/arnor-sigurdsson/EIR.
Publisher
Cold Spring Harbor Laboratory
Cited by
2 articles.