Integrating ensemble systems biology feature selection and bimodal deep neural network for breast cancer prognosis prediction

Author:

Cheng Li-Hsin, Lin Che

Abstract

Motivation: Breast cancer is a heterogeneous disease. In order to guide proper treatment decisions for each individual patient, there is an urgent need for robust prognostic biomarkers that allow reliable prognosis prediction. Gene feature selection on microarray data is an approach to systematically discover potential biomarkers. However, common pure-statistical feature selection approaches often fail to incorporate prior biological knowledge and thus tend to select genes that lack biological insights. In addition, due to the high dimensionality and low sample size properties of microarray data, selecting robust gene features is an intrinsically challenging problem. We therefore combined systems biology feature selection with ensemble learning in this study, aiming to address the above challenges and select genes with biological insights, as well as robust prognostic predictive power. Moreover, in order to capture the complex molecular processes of breast cancer, where multiple disease-contributing genes may exist and interact, we adopted a multi-gene approach to predict the prognosis status using machine learning classifiers.

Results: We systematically evaluated three different ensemble approaches that all improved the original systems biology feature selector. We found that compared to the most popular data-perturbation approach, function perturbation can produce significant improvement with just a few ensembles. Among all, the hybrid ensemble approach led to the most robust feature selection result, and the identified genes were shown to be highly involved in pathways, such as ubiquitination and cell cycle. Final prognosis prediction models were constructed using the identified genes and clinical information as input features. Among all models, bimodal deep neural network (DNN) achieved the highest AUC (area under receiver operating characteristic curve) in test performance evaluation, where subsequent survival analysis also verified its ability to differentiate patients with different prognosis statuses. In summary, the study demonstrated the potential of ensemble learning to improve gene feature selection robustness, as well as the potential of bimodal DNN in providing reliable prognosis prediction and guiding precision medicine.
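As a rough illustration of the data-perturbation idea mentioned in the abstract, the sketch below ranks features on many bootstrap resamples and aggregates the ranks. It is not the authors' method: the scorer (absolute difference of class means) is a simple stand-in for their systems biology feature selector, and all names (`score_features`, `ensemble_select`, the toy data) are hypothetical. Function perturbation and the hybrid ensemble are not shown.

```python
import random
from statistics import mean


def score_features(X, y):
    """Stand-in scorer (assumption): |mean(class 1) - mean(class 0)| per feature."""
    n_features = len(X[0])
    scores = []
    for j in range(n_features):
        pos = [row[j] for row, label in zip(X, y) if label == 1]
        neg = [row[j] for row, label in zip(X, y) if label == 0]
        scores.append(abs(mean(pos) - mean(neg)))
    return scores


def ranks_from_scores(scores):
    """Convert scores to ranks, where rank 1 is the highest score."""
    order = sorted(range(len(scores)), key=lambda j: -scores[j])
    ranks = [0] * len(scores)
    for r, j in enumerate(order, start=1):
        ranks[j] = r
    return ranks


def ensemble_select(X, y, n_rounds=50, top_k=2, seed=0):
    """Data-perturbation ensemble: rank features on bootstrap resamples,
    then keep the top_k features with the lowest mean rank."""
    rng = random.Random(seed)
    pos_idx = [i for i, label in enumerate(y) if label == 1]
    neg_idx = [i for i, label in enumerate(y) if label == 0]
    n_features = len(X[0])
    rank_sums = [0] * n_features
    for _ in range(n_rounds):
        # Stratified bootstrap: resample within each class with replacement,
        # so both classes are present in every round.
        idx = [rng.choice(pos_idx) for _ in pos_idx]
        idx += [rng.choice(neg_idx) for _ in neg_idx]
        Xb = [X[i] for i in idx]
        yb = [y[i] for i in idx]
        for j, r in enumerate(ranks_from_scores(score_features(Xb, yb))):
            rank_sums[j] += r
    mean_ranks = [s / n_rounds for s in rank_sums]
    selected = sorted(range(n_features), key=lambda j: mean_ranks[j])[:top_k]
    return selected, mean_ranks


# Toy data (made up): feature 0 tracks the label, features 1 and 2 do not.
y = [i % 2 for i in range(20)]
X = [[10.0 * (i % 2) + 0.1 * (i % 3), float(i % 5), float((i // 2) % 4)]
     for i in range(20)]

selected, mean_ranks = ensemble_select(X, y)
print(selected, mean_ranks)
```

Averaging ranks over resamples is one common aggregation choice; a feature that ranks highly only on a few resamples ends up with a poor mean rank, which is the sense in which the ensemble favors robust features.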

Publisher

Cold Spring Harbor Laboratory
