Predicting colorectal cancer recurrence by utilizing multiple-view multiple-learner supervised learning.

Author:

Castellanos Jason (1), Liu Qi (1), Beauchamp R. Daniel (2), Zhang Bing (3)

Affiliation:

1. Vanderbilt University Medical Center, Nashville, TN;

2. Vanderbilt University Medical Center, Nashville, TN;

3. Baylor College of Medicine, Houston, TX

Abstract

635 Background: Colorectal cancer (CRC) remains a leading cause of cancer-related mortality in the United States. A key therapeutic dilemma in the treatment of CRC is whether patients with stage II and stage III disease require adjuvant chemotherapy after surgical resection. Attempts to improve identification of patients at increased risk of recurrence have yielded many predictive models based on gene expression data, but none are FDA approved and none are used in standard clinical practice. To improve recurrence prediction, we use a machine learning approach to predict recurrence status at 3 years after diagnosis.
Methods: A dataset was curated from six publicly available microarray datasets, and multiple views were generated to include information from non-tumor tissue gene expression patterns, gene set structure, protein-protein interaction network structure, previously curated molecular signatures, and identified tumor suppressor/driver mutations. These views were used to train a diverse pool of base learners using 10x 10-fold cross-validation. Stacked generalization was used to train an ensemble model, also known as a meta-learner, from the predictions of these base learners.
Results: Models trained on microarray data performed significantly better than models trained on clinical data (paired Wilcoxon signed-rank test, p = 1.49 × 10^-8), demonstrating that molecular data predict recurrence significantly better than basic clinical data. Review of the model training performances revealed that non-linear classifiers often outperform linear classifiers, and that ensemble methods can also enhance performance. We also demonstrate the feasibility of the multiple-view multiple-learner (MVML) supervised learning framework to generate and integrate predictions across a diverse set of learners, with the performance of the meta-learner exceeding or matching that of the best base learners across all performance metrics.
Conclusions: This work represents the first effort to use ensemble learning to predict CRC recurrence and highlights the promise of ensemble learning to improve the performance of predictive models in order to realize the goals of precision medicine.
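
The Methods describe training base learners on several views with repeated 10-fold cross-validation and then fitting a meta-learner on their predictions (stacked generalization). The following is a minimal sketch of that general idea, not the authors' implementation. The abstract does not name the base learners or the meta-learner, so a nearest-centroid scorer and a gradient-descent logistic regression are used here purely as stand-ins. The view names, class names, function names, and the synthetic data are all hypothetical.

```python
import math
import random


class NearestCentroid:
    """Toy base learner: scores a sample by its relative distance to the class centroids."""

    def fit(self, X, y):
        self.centroids = {}
        for label in (0, 1):
            rows = [x for x, t in zip(X, y) if t == label]
            self.centroids[label] = [sum(col) / len(rows) for col in zip(*rows)]
        return self

    def predict_proba(self, X):
        scores = []
        for x in X:
            d0 = math.dist(x, self.centroids[0])
            d1 = math.dist(x, self.centroids[1])
            # Far from the class-0 centroid relative to class 1 -> score near 1.
            scores.append(d0 / (d0 + d1) if d0 + d1 > 0 else 0.5)
        return scores


class LogisticMetaLearner:
    """Toy meta-learner: logistic regression fitted by batch gradient descent."""

    def __init__(self, lr=1.0, epochs=500):
        self.lr, self.epochs = lr, epochs

    def _prob(self, x):
        z = self.b + sum(w * v for w, v in zip(self.w, x))
        return 1.0 / (1.0 + math.exp(-z))

    def fit(self, X, y):
        n, d = len(X), len(X[0])
        self.w, self.b = [0.0] * d, 0.0
        for _ in range(self.epochs):
            errors = [self._prob(x) - t for x, t in zip(X, y)]
            for j in range(d):
                self.w[j] -= self.lr * sum(e * x[j] for e, x in zip(errors, X)) / n
            self.b -= self.lr * sum(errors) / n
        return self

    def predict_proba(self, X):
        return [self._prob(x) for x in X]


def out_of_fold_predictions(X, y, k=10, repeats=10, seed=0):
    """Repeated k-fold CV: each sample is scored only by models that never saw it."""
    rng = random.Random(seed)
    n = len(y)
    totals = [0.0] * n
    for _ in range(repeats):
        order = list(range(n))
        rng.shuffle(order)
        for fold in (order[i::k] for i in range(k)):
            held_out = set(fold)
            train = [i for i in range(n) if i not in held_out]
            model = NearestCentroid().fit([X[i] for i in train], [y[i] for i in train])
            for i, p in zip(fold, model.predict_proba([X[i] for i in fold])):
                totals[i] += p
    return [t / repeats for t in totals]  # mean score across repeats


def fit_stacked(views, y, k=10, repeats=10):
    """One base learner per view; the meta-learner is trained on out-of-fold scores."""
    names = sorted(views)
    columns = [out_of_fold_predictions(views[v], y, k, repeats) for v in names]
    meta = LogisticMetaLearner().fit([list(row) for row in zip(*columns)], y)
    base = {v: NearestCentroid().fit(views[v], y) for v in names}  # refit on all data
    return names, base, meta


def predict_stacked(model, views):
    names, base, meta = model
    columns = [base[v].predict_proba(views[v]) for v in names]
    return meta.predict_proba([list(row) for row in zip(*columns)])


# Synthetic two-view data (hypothetical): label 1 shifts the feature means.
rng = random.Random(1)
y = [i % 2 for i in range(120)]
views = {
    "view_a": [[rng.gauss(1.0 * t, 1.0) for _ in range(5)] for t in y],
    "view_b": [[rng.gauss(0.5 * t, 1.0) for _ in range(8)] for t in y],
}
model = fit_stacked(views, y, k=10, repeats=2)
probs = predict_stacked(model, views)
accuracy = sum((p >= 0.5) == bool(t) for p, t in zip(probs, y)) / len(y)
```

Training the meta-learner on out-of-fold scores, rather than on scores from models that have already seen the same samples, is what keeps the stacking step from simply rewarding base learners that overfit. The accuracy computed here is on the training data and is only a sanity check, not a performance estimate.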

Publisher

American Society of Clinical Oncology (ASCO)

Subject

Cancer Research, Oncology

Cited by 2 articles.

1. Predictive models for colorectal cancer recurrence using multi-modal healthcare data;Proceedings of the Conference on Health, Inference, and Learning;2021-04-08

2. The Rise of Big Data in Oncology;Seminars in Oncology Nursing;2018-05
