Affiliation:
1. Tongde Hospital of Zhejiang Province
2. First Affiliated Hospital of Bengbu Medical College
3. Zhejiang Hospital
Abstract
Background: To predict the malignancy risk of 1-5 cm gastric gastrointestinal stromal tumors (GISTs) from CT features using three machine learning (ML) models: Logistic Regression (LR), Decision Tree (DT) and Gradient Boosting Decision Tree (GBDT).
Methods: A total of 309 patients with gastric GISTs were enrolled and divided into three cohorts: training (n=161), internal validation (n=70) and external validation (n=78). The three classifiers were built with scikit-learn. Sensitivity, specificity, accuracy and area under the curve (AUC) were calculated to evaluate the performance of the three models. The diagnostic performance of the ML models and of radiologists was compared in the internal validation cohort. Important features were analyzed and compared between LR and GBDT.
Results: GBDT achieved the largest AUC values among the three classifiers in the training and internal validation cohorts (0.981 and 0.815) and the highest accuracy in all three cohorts (0.923, 0.833 and 0.844). LR had the largest AUC value (0.910) in the external validation cohort. DT yielded the lowest accuracy (0.790 and 0.727) and AUC (0.803 and 0.700) in the two validation cohorts. GBDT and LR performed better than the two radiologists. Long diameter was the most important CT feature for both GBDT and LR.
Conclusions: ML classifiers, especially GBDT and LR, are promising for CT-based risk classification of gastric GISTs smaller than 5 cm, owing to their high accuracy and robustness. Long diameter was the most important feature for risk stratification.
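For readers less familiar with the evaluation metrics named in Methods, the following is a minimal sketch of how sensitivity, specificity, accuracy and AUC can be computed for a binary classifier. It is not the authors' code: the labels, scores, the 0.5 decision threshold and the function name `evaluate` are invented for illustration, and the AUC is computed with the pairwise (Mann-Whitney) formulation rather than with scikit-learn.

```python
from typing import Dict, Sequence


def evaluate(y_true: Sequence[int], scores: Sequence[float],
             threshold: float = 0.5) -> Dict[str, float]:
    """Compute sensitivity, specificity, accuracy and AUC for binary labels.

    y_true holds 1 for the positive class (e.g. higher-risk tumor) and 0
    otherwise; scores are predicted probabilities of the positive class.
    The threshold is an assumption made for this illustration.
    """
    y_pred = [1 if s >= threshold else 0 for s in scores]

    # Confusion-matrix counts.
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)

    # AUC as the fraction of (positive, negative) pairs in which the
    # positive case receives the higher score; ties count as one half.
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)

    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "accuracy": (tp + tn) / len(y_true),
        "auc": wins / (len(pos) * len(neg)),
    }


# Toy data, invented for illustration only (not taken from the study).
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
scores = [0.9, 0.8, 0.4, 0.3, 0.6, 0.1, 0.7, 0.2]
metrics = evaluate(y_true, scores, threshold=0.5)
print(metrics)
```

Sensitivity, specificity and accuracy depend on the chosen threshold, whereas AUC depends only on how the scores rank positive cases against negative ones, which is why the two kinds of metric can favor different models.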
Publisher
Research Square Platform LLC