Deep multi-omics integration by learning correlation-maximizing representation identifies prognostically stratified cancer subtypes

Author:

Ji Yanrong (1), Dutta Pratik (2), Davuluri Ramana (2)

Affiliation:

1. Division of Health and Biomedical Informatics, Department of Preventive Medicine, Northwestern University Feinberg School of Medicine, Chicago, IL 60611, USA

2. Department of Biomedical Informatics, Stony Brook Cancer Center, Stony Brook Medicine, Stony Brook University, Stony Brook, NY 11794, USA

Abstract

Motivation: Molecular subtyping by integrative modeling of multi-omics and clinical data can help identify robust and clinically actionable disease subgroups, an essential step in developing precision medicine approaches.

Results: We developed a novel outcome-guided molecular subgrouping framework, called Deep Multi-Omics Integrative Subtyping by Maximizing Correlation (DeepMOIS-MC), for integrative learning from multi-omics data by maximizing correlation between all input -omics views. DeepMOIS-MC consists of two parts: clustering and classification. In the clustering part, the preprocessed high-dimensional multi-omics views are input into two-layer fully connected neural networks. The outputs of the individual networks are subjected to a Generalized Canonical Correlation Analysis (GCCA) loss to learn the shared representation. Next, the learned representation is filtered by a regression model to select features that are related to a clinical covariate, for example, a survival outcome. The filtered features are used for clustering to determine the optimal cluster assignments. In the classification part, the original feature matrix of one of the -omics views is scaled and discretized by equal-frequency binning, and then subjected to feature selection using RandomForest. Using these selected features, classification models (for example, an XGBoost model) are built to predict the molecular subgroups that were identified at the clustering stage. We applied DeepMOIS-MC to lung and liver cancers using TCGA datasets. In comparative analysis, we found that DeepMOIS-MC outperformed traditional approaches in patient stratification. Finally, we validated the robustness and generalizability of the classification models on independent datasets. We anticipate that DeepMOIS-MC can be adopted for many multi-omics integrative analysis tasks.

Availability and implementation: Source code for the PyTorch implementation of DGCCA and other DeepMOIS-MC modules is available on GitHub (https://github.com/duttaprat/DeepMOIS-MC).

Supplementary information: Supplementary data are available at Bioinformatics Advances online.
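
The Generalized Canonical Correlation Analysis (GCCA) step named in the abstract can be illustrated with a small linear sketch. This is not the authors' PyTorch code: it omits the neural networks entirely and only shows, in NumPy, how a shared representation and a GCCA-style loss can be computed from several views of the same samples. The function name `gcca_shared_representation`, the `rank` and `eps` parameters, and the synthetic data are all assumptions made for illustration.

```python
import numpy as np


def gcca_shared_representation(views, rank, eps=1e-8):
    """Linear GCCA sketch (hypothetical helper, not from the DeepMOIS-MC repo).

    views: list of arrays, each of shape (n_samples, n_features_j).
    Returns (G, loss), where G has shape (n_samples, rank) with orthonormal
    columns, and loss = n_views * rank - sum of the top `rank` eigenvalues.
    """
    n_samples = views[0].shape[0]
    M = np.zeros((n_samples, n_samples))
    for X in views:
        Xc = X - X.mean(axis=0)  # center each feature
        # Regularized Gram matrix of the view (eps keeps it invertible).
        gram = Xc.T @ Xc + eps * np.eye(Xc.shape[1])
        # Add the (regularized) projection onto this view's column space.
        M += Xc @ np.linalg.solve(gram, Xc.T)
    # eigh returns eigenvalues in ascending order for symmetric matrices.
    eigvals, eigvecs = np.linalg.eigh(M)
    G = eigvecs[:, -rank:][:, ::-1]  # top `rank` eigenvectors, largest first
    loss = len(views) * rank - eigvals[-rank:].sum()
    return G, float(loss)


# Synthetic example: three views generated from one 2-D latent signal.
rng = np.random.default_rng(0)
latent = rng.normal(size=(100, 2))
demo_views = [
    latent @ rng.normal(size=(2, 4)) + 0.01 * rng.normal(size=(100, 4))
    for _ in range(3)
]
G, loss = gcca_shared_representation(demo_views, rank=2)
print(G.shape, loss)
```

In a deep variant, the inputs to this computation would be the outputs of the per-view networks rather than the raw feature matrices, and the loss would be minimized by backpropagation. The loss is small when all views can reconstruct the same low-dimensional representation and grows as the views become less correlated.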

Funder

National Library of Medicine

National Institutes of Health

Publisher

Oxford University Press (OUP)

Subject

Computer Science Applications, Genetics, Molecular Biology, Structural Biology
