Prediction-oriented prognostic biomarker discovery with survival machine learning methods

Author:

Yao Sijie (1), Cao Biwei (1), Li Tingyi (1), Kalos Denise (1), Yuan Yading (2), Wang Xuefeng (1)

Affiliation:

1. Department of Biostatistics and Bioinformatics, H. Lee Moffitt Cancer Center & Research Institute, Tampa, FL 33612, USA

2. Department of Radiation Oncology, Icahn School of Medicine at Mount Sinai, New York City, NY 10029, USA

Abstract

Identifying novel and reliable prognostic biomarkers for predicting patient survival outcomes is essential for deciding personalized treatment strategies for diseases such as cancer. Numerous feature selection techniques have been proposed to address the high-dimensionality problem in constructing prediction models. Feature selection not only lowers the data dimension but also improves the prediction accuracy of the resulting models by mitigating overfitting. However, the performance of these feature selection methods when applied to survival models deserves further investigation. In this paper, we construct and compare a series of prediction-oriented biomarker selection frameworks by leveraging recent machine learning algorithms, including random survival forests, extreme gradient boosting, light gradient boosting and deep learning-based survival models. Additionally, we adapt the recently proposed prediction-oriented marker selection (PROMISE) to a survival model (PROMISE-Cox) as a benchmark approach. Our simulation studies indicate that boosting-based approaches tend to provide superior accuracy, with higher true positive rates and lower false positive rates, in more complicated scenarios. For demonstration purposes, we apply the proposed biomarker selection strategies to identify prognostic biomarkers in different modalities of head and neck cancer data.
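The abstract reports selection quality in terms of true positive rate and false positive rate but does not spell out how they were computed. As a minimal sketch, assuming the usual simulation convention in which the truly prognostic features are known in advance, the two rates for one selected feature set could be computed as below. The function name `selection_rates` and the example indices are hypothetical and are not taken from the paper.

```python
def selection_rates(selected, true_markers, n_features):
    """Return (TPR, FPR) for a set of selected feature indices.

    Assumed convention (not stated in the abstract):
      TPR = selected true markers / all true markers
      FPR = selected null features / all null features
    """
    selected = set(selected)
    true_markers = set(true_markers)
    n_null = n_features - len(true_markers)

    true_positives = len(selected & true_markers)
    false_positives = len(selected - true_markers)

    tpr = true_positives / len(true_markers) if true_markers else 0.0
    fpr = false_positives / n_null if n_null else 0.0
    return tpr, fpr


# Hypothetical simulated setting: 100 candidate features,
# of which the first 5 (indices 0-4) are truly prognostic.
tpr, fpr = selection_rates(
    selected=[0, 1, 2, 7, 42],
    true_markers=range(5),
    n_features=100,
)
print(f"TPR = {tpr:.3f}, FPR = {fpr:.3f}")
```

In this illustrative case three of the five true markers are recovered and two of the 95 null features are selected in error; averaging such pairs over repeated simulated data sets is one way competing selection methods could be compared.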

Funder

National Institutes of Health

National Cancer Institute

Publisher

Oxford University Press (OUP)

Subject

Applied Mathematics, Computer Science Applications, Genetics, Molecular Biology, Structural Biology

