Author:
Nikolaou Nikolaos, Salazar Domingo, RaviPrakash Harish, Gonçalves Miguel, Mulla Rob, Burlutskiy Nikolay, Markuzon Natasha, Jacob Etai
Abstract
The last decade has seen an unprecedented advance in technologies at the level of high-throughput molecular assays and image capturing and analysis, as well as clinical phenotyping and digitization of patient data. For decades, genotyping (identification of genomic alterations), the causal anchor in biological processes, has been an essential component in interrogating disease progression and a guiding step in clinical decision making. Indeed, survival rates in patients tested with next-generation sequencing have been found to be significantly higher in those who received a genome-guided therapy than in those who did not. Nevertheless, DNA is only a small part of the complex pathophysiology of cancer development and progression. To obtain a more complete picture, researchers have been using data taken from multiple modalities, such as transcripts, proteins, metabolites, and epigenetic factors, that are routinely captured for many patients. Multimodal machine learning offers the potential to leverage information across different bioinformatics modalities to improve predictions of patient outcome. Identifying a multiomics data fusion strategy that clearly demonstrates an improved performance over unimodal approaches is challenging, primarily due to increased dimensionality and other factors, such as small sample sizes and the sparsity and heterogeneity of data. Here we present a flexible pipeline for systematically exploring and comparing multiple multimodal fusion strategies. Using multiple independent data sets from The Cancer Genome Atlas, we developed a late fusion strategy that consistently outperformed unimodal models, clearly demonstrating the advantage of a multimodal fusion model.
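As a rough illustration of what "late fusion" means in general, the sketch below combines the predicted probabilities of separately trained unimodal models with a weighted average. It is not the authors' pipeline: the function name `late_fusion`, the modality names, the weights, and all numbers are hypothetical and chosen only to make the idea concrete.

```python
from typing import Dict, List, Optional


def late_fusion(
    modality_probs: Dict[str, List[float]],
    weights: Optional[Dict[str, float]] = None,
) -> List[float]:
    """Fuse per-modality predicted probabilities with a weighted average.

    Each modality is assumed to have its own already-trained model; only
    the model outputs (one probability per patient) are combined here.
    """
    if not modality_probs:
        raise ValueError("need at least one modality")
    names = list(modality_probs)
    n_samples = len(modality_probs[names[0]])
    if any(len(modality_probs[m]) != n_samples for m in names):
        raise ValueError("all modalities must score the same samples")
    if weights is None:
        # Default assumption: every modality contributes equally.
        weights = {m: 1.0 for m in names}
    total = sum(weights[m] for m in names)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    return [
        sum(weights[m] * modality_probs[m][i] for m in names) / total
        for i in range(n_samples)
    ]


# Hypothetical outputs of three unimodal models for four patients.
unimodal = {
    "rna": [0.9, 0.2, 0.6, 0.4],
    "protein": [0.7, 0.4, 0.5, 0.1],
    "methylation": [0.8, 0.3, 0.1, 0.4],
}
fused = late_fusion(unimodal)
print(fused)
```

Because fusion happens at the prediction level, each unimodal model only ever sees its own feature space, which is one common motivation for late fusion when samples are few and features are many. Early fusion, by contrast, would concatenate the raw features before training a single model.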
Publisher
Cold Spring Harbor Laboratory