Abstract
Background
Gastric cancer is one of the leading causes of death worldwide. Screening for gastric cancer greatly relies on endoscopy and pathology biopsy, which are invasive and pose financial burdens. Thus, the prevention of the disease by modifying lifestyle-related behaviors and dietary habits or even the prevention of risk factor formation is of great importance. This study aimed to construct an inexpensive, non-invasive, fast, and high-precision diagnostic model using six machine learning (ML) algorithms to classify patients at high or low risk of developing gastric cancer by analyzing individual lifestyle factors.
Methods
This retrospective study used the data of 2029 individuals from the gastric cancer database of Ayatollah Taleghani Hospital in Abadan City, Iran. The data were randomly separated into training and test sets (ratio 0.7:0.3). Six ML methods, including multilayer perceptron (MLP), support vector machine (SVM) (linear kernel), SVM (RBF kernel), k-nearest neighbors (KNN) (K = 1, 3, 7, 9), random forest (RF), and eXtreme Gradient Boosting (XGBoost), were trained to construct prognostic models before and after applying the Relief feature selection method. Finally, to evaluate the models’ performance, metrics derived from the confusion matrix were calculated on the held-out test split and with cross-validation.
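The split and evaluation steps described above can be illustrated with a minimal sketch in plain Python. This is not the authors' code: the function names `split_train_test` and `confusion_metrics`, the fixed seed, the rounding rule for the test-set size, and the choice of reported metrics are all assumptions made for illustration, since the abstract does not specify them.

```python
import random


def split_train_test(rows, test_ratio=0.3, seed=0):
    """Shuffle a copy of `rows` and split it into (train, test).

    Assumption: the test-set size is round(len(rows) * test_ratio);
    the abstract only states the 0.7:0.3 ratio.
    """
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    n_test = round(len(shuffled) * test_ratio)
    return shuffled[n_test:], shuffled[:n_test]


def confusion_metrics(y_true, y_pred, positive=1):
    """Compute common metrics from a binary confusion matrix."""
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive:
            if truth == positive:
                tp += 1
            else:
                fp += 1
        else:
            if truth == positive:
                fn += 1
            else:
                tn += 1

    def ratio(num, den):
        # Return 0.0 instead of raising when a denominator is empty.
        return num / den if den else 0.0

    sensitivity = ratio(tp, tp + fn)  # recall for the high-risk class
    specificity = ratio(tn, tn + fp)
    precision = ratio(tp, tp + fp)
    return {
        "tp": tp, "fp": fp, "tn": tn, "fn": fn,
        "accuracy": ratio(tp + tn, tp + fp + tn + fn),
        "sensitivity": sensitivity,
        "specificity": specificity,
        "precision": precision,
        "f1": ratio(2 * precision * sensitivity, precision + sensitivity),
    }


if __name__ == "__main__":
    # Hypothetical labels: 1 = high risk, 0 = low risk.
    y_true = [1, 1, 1, 0, 0, 0, 0, 1]
    y_pred = [1, 1, 0, 0, 0, 1, 0, 1]
    print(confusion_metrics(y_true, y_pred))
```

In practice, any of the six classifiers would produce `y_pred` for the test split, and the same function would be applied within each cross-validation fold. The labels in the usage example are made up and do not come from the study.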
Results
This study identified 11 important factors influencing the risk of gastric cancer, including Helicobacter pylori infection, high salt intake, and chronic atrophic gastritis. Comparisons indicated that XGBoost had the best performance for gastric cancer risk prediction.
Conclusions
The results suggest that, based on simple baseline patient data, ML techniques have the potential to support prescreening for gastric cancer and to identify high-risk individuals who should proceed with invasive examinations. Our model could also considerably reduce the number of cases that need endoscopic surveillance. Future studies are required to validate the efficacy of the models in a larger, multicenter population.
Publisher
Springer Science and Business Media LLC
Subject
Gastroenterology, General Medicine
Cited by
14 articles.