Affiliation:
1. Nanning Normal University
2. Fudan University
Abstract
Background
Classifying complex diseases is an important part of their diagnosis and personalized treatment. Integrating multi-omics data has been shown to support more accurate analysis and classification of complex diseases, because multi-omics data are highly correlated with the onset and progression of various diseases and provide comprehensive, complementary information about a disease. However, multi-omics data for complex diseases are usually characterized by high class imbalance, scale variation, high heterogeneity, and heavy noise, all of which pose great challenges to multi-omics integration methods.
Results
We propose a novel multi-omics data integration learning model, MODILM, to obtain more important and complementary information for complex disease classification from multiple omics data. MODILM first constructs a similarity network for each omics data type using the cosine similarity measure. It then uses Graph Attention Networks to learn the sample-specific and intra-association features of each single-omics dataset from these similarity networks, and maps them uniformly to a new feature space, where Multilayer Perceptron networks further strengthen and extract high-level omics-specific features. Finally, MODILM uses a View Correlation Discovery Network to fuse the high-level omics-specific features extracted from each omics data type and to learn cross-omics features in the label space, which provides class-level distinctiveness for classifying complex diseases. We conducted extensive experiments on six benchmark datasets containing miRNA expression, mRNA, and DNA methylation data to demonstrate the superiority of MODILM. The experimental results show that MODILM outperforms state-of-the-art methods, effectively improving the accuracy of complex disease classification.
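As a rough illustration of the first step only, the sketch below builds a sample similarity network from one omics matrix using cosine similarity. The abstract does not say how similarities are turned into edges, so the fixed `threshold` rule, the self-loops, and the function names `cosine_similarity` and `similarity_network` are assumptions made for this example, not the authors' implementation.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # convention chosen here for all-zero vectors
    return dot / (norm_u * norm_v)


def similarity_network(samples, threshold):
    """Binary adjacency matrix over samples.

    Assumption: two samples are linked when their cosine similarity
    is at least `threshold`; every sample also gets a self-loop.
    """
    n = len(samples)
    adj = [[0] * n for _ in range(n)]
    for i in range(n):
        adj[i][i] = 1  # self-loop (assumed)
        for j in range(i + 1, n):
            if cosine_similarity(samples[i], samples[j]) >= threshold:
                adj[i][j] = adj[j][i] = 1  # undirected edge
    return adj


# Toy data: three samples with three features each (values are made up).
samples = [
    [1.0, 2.0, 3.0],
    [2.0, 4.0, 6.0],   # same direction as the first sample
    [3.0, -1.0, 0.0],  # nearly orthogonal to the first two
]
adjacency = similarity_network(samples, threshold=0.9)
print(adjacency)
```

With this toy input, only the first two samples are linked, because their feature vectors point in the same direction. In a pipeline like the one described, a graph neural network such as a Graph Attention Network would then take an adjacency structure of this kind together with the per-sample features as input.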
Conclusions
MODILM provides a competitive way to extract and integrate important and complementary information from multiple omics data, offering a promising tool to support decision-making in clinical diagnosis.
Publisher
Research Square Platform LLC