Affiliation:
1. Intelligent Data Center, School of Mathematics, Sun Yat-Sen University, Guangzhou, 510275, China
Abstract
The discovery of cancer subtypes has become a much-researched topic in oncology. Dividing cancer patients into subtypes can enable personalized treatment for heterogeneous patients. High-throughput technologies provide multiple types of omics data for cancer subtyping. Many computational methods integrate multi-view data to identify cancer subtypes, yet they obtain different subtypes for the same cancer, even when using the same multi-omics data. To a certain extent, the subtypes produced by distinct methods are related, and this relationship may offer guidance for cancer subtyping. The challenge is to use the valuable information in these distinct subtypes effectively to produce more accurate and reliable subtypes. A weighted ensemble sparse latent representation (subtype-WESLR) is proposed to detect cancer subtypes on heterogeneous omics data. Using a weighted ensemble strategy to fuse the base clusterings obtained by distinct methods as prior knowledge, subtype-WESLR projects each sample's feature profile from each data type into a common latent subspace while maintaining the local structure of the original sample feature space and consistency with the weighted ensemble. It then optimizes the common subspace with an iterative method to identify cancer subtypes. We conduct experiments on various synthetic datasets and eight public multi-view datasets from The Cancer Genome Atlas. The results demonstrate that subtype-WESLR outperforms competing methods by integrating the base clusterings of existing methods to obtain more precise subtypes.
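To make the idea of fusing base clusterings concrete, here is a minimal sketch of one common ensemble technique, a weighted co-association matrix. The abstract does not say how subtype-WESLR computes its weights or formulates the fusion, so this is not the authors' method: the function name `weighted_coassociation`, the toy labels, and the weights are all hypothetical and chosen only for illustration.

```python
import numpy as np


def weighted_coassociation(base_labels, weights):
    """Fuse several base clusterings into one weighted co-association matrix.

    Entry (i, j) is the total weight of the base clusterings that place
    samples i and j in the same cluster. This is a generic ensemble
    technique, assumed here for illustration only.
    """
    base_labels = np.asarray(base_labels)
    weights = np.asarray(weights, dtype=float)
    weights = weights / weights.sum()  # normalize so the weights sum to 1

    n_samples = base_labels.shape[1]
    fused = np.zeros((n_samples, n_samples))
    for labels, w in zip(base_labels, weights):
        # Boolean matrix: True where two samples share a cluster label.
        same_cluster = labels[:, None] == labels[None, :]
        fused += w * same_cluster
    return fused


# Hypothetical output of three base clustering methods on four samples.
base_labels = [
    [0, 0, 1, 1],
    [1, 1, 0, 0],  # same partition as the first, with permuted label names
    [0, 1, 1, 1],
]
weights = [0.5, 0.3, 0.2]  # hypothetical per-method weights

fused = weighted_coassociation(base_labels, weights)
print(fused)
```

Because the matrix only records whether two samples share a cluster, it is unaffected by how each method names its clusters, which is why the first two toy clusterings agree fully despite using different label values.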
Funder
National Natural Science Foundation of China
Publisher
Oxford University Press (OUP)
Subject
Molecular Biology, Information Systems
Cited by
20 articles.