Performance comparison of ten state-of-the-art machine learning algorithms for outcome prediction modeling of radiation-induced toxicity

Authors:

Salazar Ramon M., Nair Saurabh S., Leone Alexandra O., Xu Ting, Mumme Raymond P., Duryea Jack D., De Brian, Corrigan Kelsey L., Rooney Michael K., Ning Matthew S., Das Prajnan, Holliday Emma B., Liao Zhongxing, Court Laurence E., Niedzielski Joshua S.

Abstract

Purpose: To evaluate the efficacy of prominent machine learning algorithms in predicting normal tissue complication probability using clinical data from two distinct disease sites, and to create a software tool that automatically determines the optimal algorithm to model any given labeled dataset.

Methods and Materials: We obtained three sets of radiation toxicity data (478 patients) from our clinic: gastrointestinal toxicity (GIT), radiation pneumonitis (RP), and radiation esophagitis (RE). These data comprised clinicopathological and dosimetric information for patients diagnosed with non-small cell lung cancer and anal squamous cell carcinoma. Each dataset was modeled using ten commonly employed machine learning algorithms (elastic net, LASSO, random forest, regression forest, support vector machine, XGBoost, k-nearest neighbors, neural network, Bayesian-LASSO, and Bayesian neural network) by randomly dividing the dataset into a training and a test set. The training set was used to create and tune the model, and the test set served to assess it by calculating performance metrics. This process was repeated 100 times for each algorithm and dataset. Figures were generated to visually compare the performance of the algorithms. A graphical user interface was developed to automate this whole process.

Results: LASSO achieved the highest area under the precision-recall curve (AUPRC) for RE (0.807±0.067), random forest for GIT (0.726±0.096), and the neural network for RP (0.878±0.060). Area under the curve was 0.754±0.069, 0.889±0.043, and 0.905±0.045, respectively. The graphical user interface was used to compare all algorithms for each dataset automatically. When averaging AUPRC across all toxicities, Bayesian-LASSO was the best model.

Conclusion: Our results show that there is no single best algorithm for all datasets. Therefore, it is important to compare multiple algorithms when training an outcome prediction model on a new dataset. The graphical user interface created for this study automatically compares the performance of these ten algorithms for any dataset.
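The evaluation procedure described above — repeated random train/test splits, fitting each algorithm on the training portion, and scoring it by AUPRC on the held-out portion — can be sketched as follows. This is a hypothetical illustration, not the authors' code: the synthetic dataset, the two stand-in models, and all parameter choices are assumptions for demonstration.

```python
# Hypothetical sketch (not the authors' implementation): compare classifiers
# by AUPRC over repeated random train/test splits, as the abstract describes.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import average_precision_score
from sklearn.model_selection import train_test_split

# Stand-in for a clinical toxicity dataset (the study used 478 patients).
X, y = make_classification(n_samples=478, n_features=20, random_state=0)

models = {
    # L1-penalized logistic regression stands in for the paper's LASSO model.
    "LASSO": LogisticRegression(penalty="l1", solver="liblinear", C=1.0),
    "random forest": RandomForestClassifier(n_estimators=50, random_state=0),
}

n_repeats = 100  # the study repeated the split 100 times per algorithm
results = {name: [] for name in models}
for seed in range(n_repeats):
    X_tr, X_te, y_tr, y_te = train_test_split(
        X, y, test_size=0.3, stratify=y, random_state=seed
    )
    for name, model in models.items():
        model.fit(X_tr, y_tr)
        scores = model.predict_proba(X_te)[:, 1]
        results[name].append(average_precision_score(y_te, scores))

# Report mean ± standard deviation of AUPRC across repeats.
for name, auprcs in results.items():
    print(f"{name}: AUPRC {np.mean(auprcs):.3f} ± {np.std(auprcs):.3f}")
```

The same loop generalizes to the ten algorithms in the study by extending the `models` dictionary; averaging the per-repeat scores yields the mean±SD figures reported in the Results section.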

Publisher

Cold Spring Harbor Laboratory
