Affiliation:
1. Engineering, Computer and Mathematical Sciences, Auckland University of Technology, Auckland 1010, New Zealand
2. Biodesign Lab, New Zealand College of Chiropractic, Auckland 1010, New Zealand
3. Center of Chiropractic Research, New Zealand College of Chiropractic, Auckland 1010, New Zealand
4. Center for Sensory-Motor Interaction, Department of Health Science and Technology, Aalborg University, 9220 Aalborg, Denmark
5. Department of Clinical Sciences, Auckland University of Technology, Auckland 1010, New Zealand
Abstract
This study examines the performance of various machine learning (ML) models in predicting interstitial glucose (IG) levels using data from wrist-worn wearable sensors. The insights from these predictions can aid in understanding metabolic syndromes and disease states. A public dataset comprising data from the Empatica E4 smartwatch, the Dexcom continuous glucose monitor (CGM) measuring IG, and a food log was used. The raw data were processed into features, which were then used to train the ML models. The models evaluated were decision tree (DT), support vector machine (SVM), random forest (RF), linear discriminant analysis (LDA), k-nearest neighbors (KNN), Gaussian naïve Bayes (GNB), lasso with cross-validation (LassoCV), Ridge, Elastic Net, and XGBoost. For classification, IG values were labeled as high, standard, or low, and model performance was assessed using accuracy (40–78%), precision (41–78%), recall (39–77%), F1-score (0.31–0.77), and receiver operating characteristic (ROC) curves. Regression models predicting IG values were evaluated using R-squared (−7.84 to 0.84), mean absolute error (5.54–60.84 mg/dL), root mean square error (RMSE; 9.04–68.07 mg/dL), and visual methods such as residual and QQ plots. To assess whether the differences between models were statistically significant, a Friedman test was performed, followed by the Nemenyi post hoc test. Tree-based models, particularly RF and DT, achieved higher classification accuracy than the other models. For regression, the RF model achieved the lowest RMSE of 9.04 mg/dL with an R-squared value of 0.84, while the GNB model performed the worst, with an RMSE of 68.07 mg/dL. A SHAP analysis identified time from midnight as the most significant predictor. Partial dependence plots revealed complex feature interactions in the RF model, in contrast to the simpler interactions captured by LDA.
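The regression metrics reported in the abstract include a negative R-squared (−7.84), which can look surprising at first. The following is a minimal sketch, not the authors' code, of how R-squared, mean absolute error, and RMSE are commonly computed. The function name `regression_metrics` and the sample readings are illustrative assumptions, not values from the study. It shows that R-squared drops below zero whenever a model's squared error exceeds that of simply predicting the mean of the observed values.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return (r2, mae, rmse) for paired observed and predicted values."""
    n = len(y_true)
    mean_true = sum(y_true) / n
    # Residual sum of squares: error of the model's predictions.
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    # Total sum of squares: error of always predicting the observed mean.
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    mae = sum(abs(t - p) for t, p in zip(y_true, y_pred)) / n
    rmse = math.sqrt(ss_res / n)
    r2 = 1.0 - ss_res / ss_tot
    return r2, mae, rmse


# Made-up IG readings in mg/dL, for illustration only (not from the study).
observed = [90.0, 100.0, 110.0, 120.0, 130.0]
close_fit = [92.0, 99.0, 108.0, 123.0, 128.0]
poor_fit = [150.0, 40.0, 170.0, 60.0, 190.0]

r2_close, mae_close, rmse_close = regression_metrics(observed, close_fit)
r2_poor, mae_poor, rmse_poor = regression_metrics(observed, poor_fit)

print(f"close fit: R2={r2_close:.3f}, MAE={mae_close:.2f}, RMSE={rmse_close:.2f}")
print(f"poor fit:  R2={r2_poor:.3f}, MAE={mae_poor:.2f}, RMSE={rmse_poor:.2f}")
```

In this toy example the poorly fitting predictions give a negative R-squared because their squared error is larger than the variance of the observed readings, which is the situation a strongly negative value such as −7.84 indicates.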