Affiliation:
1. Department of Colorectal Cancer Surgery, National Cancer Center/National Clinical Research Center for Cancer/Cancer Hospital, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing, China
2. Department of Gastrointestinal Surgery, The First Affiliated Hospital, Sun Yat‐sen University, Guangzhou, Guangdong, China
Abstract
Background: More accurate prediction of distant metastases (DM) in patients with colorectal cancer (CRC) would optimize individualized treatment and follow‐up strategies. Multiple prediction models based on machine learning have been developed to assess the likelihood of developing DM.

Methods: Clinicopathological features of patients with CRC were obtained from the National Cancer Center (NCC, China) and the Surveillance, Epidemiology, and End Results (SEER) database. The algorithms used to create the prediction models included random forest (RF), logistic regression, extreme gradient boosting, deep neural networks, and K‐nearest neighbor. The performance of the prediction models was evaluated using receiver operating characteristic (ROC) curves.

Results: In total, 200,958 patients with CRC were identified (3,241 from NCC and 197,717 from SEER), of whom 21,736 (10.8%) developed DM. The machine‐learning‐based prediction models for DM were constructed with the 12 features that remained after iterative filtering. The RF model performed best, with areas under the ROC curve of 0.843, 0.793, and 0.806 on the training, test, and external validation sets, respectively. For the risk stratification analysis, patients were separated into high‐, middle‐, and low‐risk groups according to their risk scores. Patients in the high‐risk group had the highest incidence of DM and the worst prognosis. Surgery, chemotherapy, and radiotherapy significantly improved the prognosis of the high‐risk and middle‐risk groups, whereas the low‐risk group benefited only from surgery and chemotherapy.

Conclusion: The RF‐based model accurately predicted the likelihood of DM and identified patients with CRC in the high‐risk group, providing guidance for personalized clinical decision‐making.
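The two evaluation steps named in the abstract, the area under the ROC curve and the three‐level risk stratification, can be illustrated with a minimal Python sketch. It is not the authors' pipeline. The function names `roc_auc` and `stratify`, the cut‐offs 0.25 and 0.55, and the toy labels and scores are all assumptions made for illustration, since the abstract does not report the thresholds or any patient‐level data.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve, computed as the probability that a randomly
    chosen positive case scores higher than a randomly chosen negative case
    (ties count as 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def stratify(scores, low_cut, high_cut):
    """Assign each risk score to a low/middle/high group.
    The two cut-offs are hypothetical; the abstract does not state them."""
    groups = []
    for s in scores:
        if s < low_cut:
            groups.append("low")
        elif s < high_cut:
            groups.append("middle")
        else:
            groups.append("high")
    return groups


# Toy data invented for illustration only (not from NCC or SEER).
# label 1 = developed DM, label 0 = did not; scores = predicted risk.
labels = [0, 0, 1, 0, 1, 1, 0, 1]
scores = [0.05, 0.20, 0.35, 0.40, 0.60, 0.80, 0.10, 0.90]

auc = roc_auc(labels, scores)
groups = stratify(scores, low_cut=0.25, high_cut=0.55)
print(auc, groups)
```

The pairwise definition of the AUC is used here because it needs no external library. On cohorts of the size reported above, a rank‐based implementation such as the one in scikit‐learn would be the practical choice.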