Construction and validation of machine learning models for predicting distant metastases in newly diagnosed colorectal cancer patients: A large‐scale and real‐world cohort study

Author:

Wei Ran (1, 2), Yu Guanhua (1), Wang Xishan (1), Jiang Zheng (1), Guan Xu (1)

Affiliation:

1. Department of Colorectal Cancer Surgery, National Cancer Center/National Clinical Research Center for Cancer/Cancer Hospital, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing, China

2. Department of Gastrointestinal Surgery, The First Affiliated Hospital, Sun Yat‐sen University, Guangzhou, Guangdong, China

Abstract

Background: More accurate prediction of distant metastases (DM) in patients with colorectal cancer (CRC) would optimize individualized treatment and follow‐up strategies. Multiple prediction models based on machine learning have been developed to assess the likelihood of developing DM.

Methods: Clinicopathological features of patients with CRC were obtained from the National Cancer Center (NCC, China) and the Surveillance, Epidemiology, and End Results (SEER) database. The algorithms used to create the prediction models included random forest (RF), logistic regression, extreme gradient boosting, deep neural networks, and the K‐Nearest Neighbor machine. The prediction models' performances were evaluated using receiver operating characteristic (ROC) curves.

Results: In total, 200,958 patients with CRC were identified (3241 from NCC and 197,717 from SEER), of whom 21,736 (10.8%) developed DM. The machine‐learning‐based prediction models for DM were constructed with the 12 features that remained after iterative filtering. The RF model performed best, with areas under the ROC curve of 0.843, 0.793, and 0.806 on the training, test, and external validation sets, respectively. For the risk stratification analysis, patients were separated into high‐, middle‐, and low‐risk groups according to their risk scores. Patients in the high‐risk group had the highest incidence of DM and the worst prognosis. Surgery, chemotherapy, and radiotherapy significantly improved the prognosis of the high‐risk and middle‐risk groups, whereas the low‐risk group benefited only from surgery and chemotherapy.

Conclusion: The RF‐based model accurately predicted the likelihood of DM and identified patients with CRC in the high‐risk group, providing guidance for personalized clinical decision‐making.
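The two evaluation ideas named in the abstract, area under the ROC curve and stratification by risk score, can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the labels and scores are synthetic, the function names are hypothetical, and the cutoffs 0.25 and 0.65 are arbitrary, since the abstract does not report the thresholds or the code the authors used.

```python
from typing import Dict, List, Sequence


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Area under the ROC curve via the rank (Mann-Whitney) formulation.

    This equals the probability that a randomly chosen positive case gets a
    higher score than a randomly chosen negative case; ties count as 0.5.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def stratify(scores: Sequence[float], low_cut: float, high_cut: float) -> List[str]:
    """Assign each risk score to a low, middle, or high group.

    The cutoffs are hypothetical parameters, not values from the study.
    """
    groups = []
    for s in scores:
        if s < low_cut:
            groups.append("low")
        elif s < high_cut:
            groups.append("middle")
        else:
            groups.append("high")
    return groups


def incidence_by_group(labels: Sequence[int], groups: Sequence[str]) -> Dict[str, float]:
    """Observed fraction of positive cases (here: DM) within each risk group."""
    totals: Dict[str, int] = {}
    events: Dict[str, int] = {}
    for y, g in zip(labels, groups):
        totals[g] = totals.get(g, 0) + 1
        events[g] = events.get(g, 0) + y
    return {g: events[g] / totals[g] for g in totals}


# Synthetic toy data: 1 = developed DM, 0 = did not (not study data).
labels = [0, 0, 0, 1, 0, 1, 1, 1]
scores = [0.05, 0.10, 0.20, 0.30, 0.40, 0.60, 0.80, 0.90]

auc = roc_auc(labels, scores)
groups = stratify(scores, low_cut=0.25, high_cut=0.65)
incidence = incidence_by_group(labels, groups)

print(f"AUC = {auc:.4f}")
print(incidence)
```

On this toy data the incidence of the positive class rises from the low group to the high group, which is the qualitative pattern the abstract reports for its real cohorts. In practice a library routine such as scikit-learn's `roc_auc_score` would replace the quadratic pairwise loop.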

Publisher

Wiley
