Author:
Ahn Ji Hyun, Kwak Min Seob, Lee Hun Hee, Cha Jae Myung, Shin Hyun Phil, Jeon Jung Won, Yoon Jin Young
Abstract
Background: A simplified prediction model for lymph node metastasis (LNM) in patients with early colorectal cancer (CRC) is urgently needed to guide treatment and follow-up strategies. We therefore aimed to develop an accurate predictive model for LNM in early CRC.
Methods: We analyzed data from the 2004-2016 Surveillance, Epidemiology, and End Results (SEER) database to develop and validate prediction models for LNM. Seven models were compared: logistic regression, XGBoost, k-nearest neighbors, classification and regression trees (CART), support vector machine, neural network, and random forest (RF).
Results: A total of 26,733 patients with a diagnosis of early CRC (T1) were analyzed. The models included 8 independent prognostic variables: age at diagnosis, sex, race, primary site, histologic type, tumor grade, and tumor size. LNM was significantly more frequent in patients with larger tumors, women, younger patients, and patients with more poorly differentiated tumors. The RF model showed the best predictive performance of the seven methods, achieving an accuracy of 96.0%, a sensitivity of 99.7%, a specificity of 92.9%, and an area under the curve of 0.991. Tumor size was the most important feature in predicting LNM in early CRC.
Conclusion: We established a simplified, reproducible predictive model for LNM in early CRC that could be used to guide treatment decisions. These findings warrant further confirmation in large prospective clinical trials.
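The four performance measures reported in the abstract (accuracy, sensitivity, specificity, and area under the curve) can be illustrated with a minimal Python sketch. This is not the authors' code: the function names are hypothetical, the labels and scores are invented toy values rather than SEER data, and the 0.5 decision threshold is an assumption. Label 1 stands for LNM present and 0 for LNM absent.

```python
def classification_metrics(y_true, y_pred):
    """Accuracy, sensitivity, and specificity from binary labels (1 = LNM)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    return {
        "accuracy": (tp + tn) / len(pairs),
        "sensitivity": tp / (tp + fn),  # share of LNM cases detected
        "specificity": tn / (tn + fp),  # share of non-LNM cases cleared
    }


def area_under_curve(y_true, scores):
    """AUC as the probability that a random positive case scores higher
    than a random negative case (ties count as one half)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy example: invented labels and predicted probabilities, not study data.
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.7, 0.3, 0.2, 0.1, 0.05]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed 0.5 threshold

metrics = classification_metrics(y_true, y_pred)
auc_value = area_under_curve(y_true, scores)
print(metrics, auc_value)
```

Sensitivity and specificity depend on the chosen threshold, whereas the area under the curve summarizes ranking quality across all thresholds, which is why the two kinds of measure are usually reported side by side.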
Funder
National Research Foundation of Korea