Abstract
Integrating multi-omics data into predictive models has the potential to enhance accuracy, which is essential for precision medicine. In this study, we developed interpretable predictive models for multi-omics data by employing neural networks informed by prior biological knowledge, referred to as visible networks. These neural networks offer insights into the decision-making process and can unveil novel perspectives on the underlying biological mechanisms associated with traits and complex diseases. We tested the performance, interpretability, and generalizability of these networks for inferring smoking status, subject age, and LDL levels using genome-wide RNA-expression and CpG methylation data from blood of the BIOS consortium (4 population cohorts, N_total = 2940). In a cohort-wise cross-validation setting, the consistency of the diagnostic performance and interpretation was assessed. Performance was consistently high for predicting smoking status, with an overall mean AUC of 0.95 (95% CI, 0.90 - 1.00), and interpretation revealed the involvement of well-replicated genes such as AHRR, GPR15, and LRRN3. LDL-level predictions generalized in only a single cohort, with an R² of 0.07 (95% CI, 0.05 - 0.08). Age was inferred with a mean error of 5.16 (95% CI, 3.97 - 6.35) years, with the genes COL11A2, AFAP1, OTUD7A, PTPRN2, ADARB2, and CD34 consistently predictive. In general, we found that using multi-omics networks improved performance, stability, and generalizability compared to interpretable single-omics networks. We believe that visible neural networks have great potential for multi-omics analysis; they combine multi-omics data elegantly, are interpretable, and generalize well to data from different cohorts.
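The cohort-wise cross-validation mentioned above can be illustrated with a minimal sketch. It assumes one plausible reading of the setup, in which each cohort is held out in turn while the model is trained on the remaining cohorts; the abstract does not spell out the exact protocol. The function name `leave_one_cohort_out` and the cohort labels are hypothetical and are not taken from the study or the BIOS data.

```python
from collections import defaultdict
from typing import Dict, Iterator, List, Sequence, Tuple


def leave_one_cohort_out(
    cohorts: Sequence[str],
) -> Iterator[Tuple[str, List[int], List[int]]]:
    """Yield (held_out_cohort, train_indices, test_indices) for each cohort.

    Assumption: every cohort serves once as the test set, and all other
    cohorts together form the training set.
    """
    # Group sample indices by their cohort label.
    by_cohort: Dict[str, List[int]] = defaultdict(list)
    for idx, name in enumerate(cohorts):
        by_cohort[name].append(idx)

    # Hold out one cohort at a time, in a deterministic (sorted) order.
    for held_out in sorted(by_cohort):
        test_idx = by_cohort[held_out]
        train_idx = sorted(
            i
            for name, idxs in by_cohort.items()
            if name != held_out
            for i in idxs
        )
        yield held_out, train_idx, test_idx


# Hypothetical cohort labels, one per sample (illustrative only).
sample_cohorts = ["A", "A", "B", "C", "C", "D", "B", "D"]
folds = list(leave_one_cohort_out(sample_cohorts))

for held_out, train_idx, test_idx in folds:
    print(f"test cohort {held_out}: train={train_idx} test={test_idx}")
```

Because no sample from the held-out cohort is seen during training, a metric computed on each fold reflects how well a model transfers to an unseen cohort, which is the kind of generalizability the reported per-cohort results describe.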
Publisher
Cold Spring Harbor Laboratory
Cited by
2 articles.