Abstract
The rapid growth of multi-omics datasets, together with the wealth of existing biological prior knowledge, necessitates the development of effective methods for their integration. Such methods are essential for building predictive models and identifying disease-related molecular markers. We propose a framework for supervised integration of multi-omics data with biological priors represented as knowledge graphs. Our framework uses graph neural networks (GNNs) to model the relationships among features from high-dimensional omics data and set transformers to integrate low-dimensional representations of omics features. Furthermore, our framework incorporates explainability methods to elucidate important biomarkers and extract interaction relationships between biological quantities of interest. We demonstrate the effectiveness of our approach by applying it to Alzheimer’s disease (AD) multi-omics data from the ROSMAP cohort, showing that the integration of transcriptomics and proteomics data with AD biological domain network priors improves the prediction accuracy of AD status and highlights robust AD biomarkers.
Publisher
Cold Spring Harbor Laboratory