Abstract
Aims
To evaluate the ability of various machine learning models to predict in-hospital mortality in patients diagnosed with acute paraquat poisoning (APP).
Methods
Patients presenting between September 2010 and January 2022 were identified retrospectively from the emergency departments of West China Hospital, Sichuan University, People's Republic of China. A total of 724 patients were randomly divided into a training set (80% of subjects) and a validation set (20% of subjects). The least absolute shrinkage and selection operator (LASSO) method was used to identify significant features associated with in-hospital mortality in APP, and nine machine learning models were constructed. The models were evaluated on the validation set using accuracy, precision, recall, F-measure, the area under the receiver operating characteristic curve (AUC), the precision-recall curve (PRC), and clinical decision curve analysis (DCA). The CatBoost model was employed to predict in-hospital mortality in patients with APP, and the iBreakDown and SHapley Additive exPlanations (SHAP) packages in R were used to interpret it.
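The split and the threshold-based metrics named above can be illustrated with a minimal sketch. This is not the study's code: the function names, the fixed seed, the rounding rule for the 80/20 cut, and the toy labels are all assumptions made for illustration, and only the Python standard library is used.

```python
import random


def split_indices(n, train_fraction=0.8, seed=0):
    """Randomly partition range(n) into training and validation index lists.

    The seed and the rounding rule are illustrative assumptions; the source
    only states that the split was random with an 80%/20% ratio.
    """
    indices = list(range(n))
    random.Random(seed).shuffle(indices)
    cut = int(round(n * train_fraction))
    return indices[:cut], indices[cut:]


def classification_metrics(y_true, y_pred):
    """Accuracy, precision, recall and F-measure for binary labels (1 = death)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f_measure = (
        2 * precision * recall / (precision + recall)
        if (precision + recall)
        else 0.0
    )
    return {
        "accuracy": (tp + tn) / len(pairs),
        "precision": precision,
        "recall": recall,
        "f_measure": f_measure,
    }


def roc_auc(y_true, scores):
    """AUC as the probability that a random positive outranks a random negative.

    Ties count as half a win. This pairwise form is O(n^2) and is meant for
    clarity, not for large data sets.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = sum(1.0 if p > q else 0.5 if p == q else 0.0 for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


# Toy usage: hypothetical labels and scores, not data from the study.
train_idx, valid_idx = split_indices(724)
metrics = classification_metrics([1, 1, 0, 0, 1], [1, 0, 0, 1, 1])
auc = roc_auc([1, 1, 0, 0], [0.9, 0.4, 0.5, 0.1])
```

Under the rounding rule assumed here, 724 patients would give 579 training and 145 validation cases; the source does not report the exact set sizes.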
Results
A total of 724 patients with APP were enrolled, of whom 360 died. During feature selection, six variables were chosen as predictive indicators for the models. Nine models were compared: Adaptive Boosting (AdaBoost), CatBoost, Decision Tree (DT), Gradient Boosting Decision Tree (GBDT), Light Gradient Boosting Machine (LightGBM), Logistic Classification, Random Forest (RF), Support Vector Machine (SVM), and eXtreme Gradient Boosting (XGBoost). CatBoost was the best-performing model (accuracy = 1, precision = 1, recall = 1, F-measure = 1, and AUC = 1). Furthermore, PRC and DCA indicated that the model had excellent predictive performance.
Conclusions
Machine learning models can predict the likelihood of in-hospital mortality in patients with APP accurately and reliably. Of the ensemble learning models tested (RF, AdaBoost, CatBoost, GBDT, LightGBM, and XGBoost), CatBoost exhibited nearly flawless performance. These results demonstrate the feasibility of integrating machine learning models into electronic health records to facilitate informed care and service planning.
Publisher
Research Square Platform LLC