Author:
Hu Xiaoqi, Hu Xiaolin, Yu Ya, Wang Jia
Abstract
Objective: To develop an extreme gradient boosting (XGBoost) machine learning (ML) model for predicting gestational diabetes mellitus (GDM) and to compare it with a model built using traditional logistic regression (LR).
Methods: A case–control study was carried out among pregnant women, who were assigned to either the training set (recruited from August 2019 to November 2019) or the testing set (recruited in August 2020). The XGBoost ML approach was applied to identify the best set of predictors from 33 candidate variables. Model performance was assessed using the area under the receiver operating characteristic (ROC) curve (AUC) for discrimination, and the Hosmer–Lemeshow (HL) test and calibration plots for calibration. Decision curve analysis (DCA) was used to evaluate the clinical utility of each model.
Results: A total of 735 and 190 pregnant women were included in the training and testing sets, respectively. The XGBoost ML model, which included 20 predictors, achieved an AUC of 0.946 and a predictive accuracy of 0.875, whereas the traditional LR model included four predictors and achieved an AUC of 0.752 and a predictive accuracy of 0.786. The HL test and calibration plots showed that both models were well calibrated. DCA indicated that treating only those women whom the XGBoost ML model predicted to be at risk of GDM confers a net benefit compared with treating all women or treating none.
Conclusions: The XGBoost ML model showed better discrimination than the traditional LR model. The calibration performance of both models was good.
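As a rough illustration of the comparison described in the Methods, the Python sketch below fits an XGBoost classifier and a logistic regression benchmark and reports test-set AUC and accuracy. The synthetic data, column counts, split sizes, and hyperparameters are illustrative assumptions only; they do not reproduce the study's actual predictors or results.

```python
# Minimal sketch of an XGBoost-vs-logistic-regression comparison on AUC and
# accuracy, assuming tabular predictor data; all values below are hypothetical.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score, accuracy_score
from xgboost import XGBClassifier

# Synthetic stand-in for 33 candidate antenatal variables (not the study data).
X, y = make_classification(n_samples=925, n_features=33, n_informative=20,
                           weights=[0.8, 0.2], random_state=0)

# Split sizes loosely mirror the 735-woman training set and 190-woman test set.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=190, stratify=y, random_state=0)

# XGBoost model over all candidate predictors; feature importances could then
# be used to retain a reduced predictor set (the study keeps 20 predictors).
xgb = XGBClassifier(n_estimators=200, max_depth=3, learning_rate=0.1,
                    eval_metric="logloss")
xgb.fit(X_train, y_train)

# Conventional logistic regression benchmark (the study's LR model uses four
# predictors chosen separately; here it is fit on the same features for brevity).
lr = LogisticRegression(max_iter=1000)
lr.fit(X_train, y_train)

# Discrimination (AUC) and accuracy on the held-out test set.
for name, model in [("XGBoost", xgb), ("Logistic regression", lr)]:
    prob = model.predict_proba(X_test)[:, 1]
    pred = (prob >= 0.5).astype(int)
    print(f"{name}: AUC={roc_auc_score(y_test, prob):.3f}, "
          f"accuracy={accuracy_score(y_test, pred):.3f}")
```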
Subject
Endocrinology, Diabetes and Metabolism
Cited by
4 articles.