Abstract
Background
Engagement is key to interventions that achieve successful behavior change and improvements in health. There is limited literature on the application of predictive machine learning (ML) models to data from commercially available weight loss programs to predict disengagement. Such predictions could be used to help participants achieve their goals.
Objective
This study aimed to use explainable ML to predict the risk of member disengagement week by week over 12 weeks on a commercially available web-based weight loss program.
Methods
Data were available from 59,686 adults who participated in the weight loss program between October 2014 and September 2019. Data included year of birth, sex, height, weight, motivation to join the program, use statistics (eg, weight entries, entries into the food diary, views of the menu, and program content), program type, and weight loss. Random forest, extreme gradient boosting, and logistic regression with L1 regularization models were developed and validated using a 10-fold cross-validation approach. In addition, temporal validation was performed on a test cohort of 16,947 members who participated in the program between April 2018 and September 2019, and the remaining data were used for model development. Shapley values were used to identify globally relevant features and explain individual predictions.
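The two validation schemes described above can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the record layout (`id`, `start_date`), the exact cutoff day of April 1, 2018, and the helper names `temporal_split` and `k_fold_indices` are assumptions made for the example, and the toy members are invented.

```python
import random
from datetime import date


def temporal_split(members, cutoff=date(2018, 4, 1)):
    """Split members by start date: before the cutoff goes to development,
    on or after the cutoff goes to the temporal test cohort.
    The exact cutoff day is an assumption; the source only says April 2018."""
    development = [m for m in members if m["start_date"] < cutoff]
    test = [m for m in members if m["start_date"] >= cutoff]
    return development, test


def k_fold_indices(n_samples, k=10, seed=0):
    """Yield (train_idx, valid_idx) pairs for shuffled k-fold cross-validation."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Deal the shuffled indices into k folds of near-equal size.
    folds = [indices[i::k] for i in range(k)]
    for i, valid_idx in enumerate(folds):
        train_idx = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield train_idx, valid_idx


# Hypothetical toy records, for illustration only.
members = [
    {"id": 1, "start_date": date(2015, 3, 10)},
    {"id": 2, "start_date": date(2017, 11, 2)},
    {"id": 3, "start_date": date(2018, 4, 1)},
    {"id": 4, "start_date": date(2019, 8, 20)},
]

development, test = temporal_split(members)
cv_folds = list(k_fold_indices(25, k=10))
```

In a temporal split, the test cohort is strictly later than the development cohort, so it probes whether a model trained on earlier members still performs on later ones. Cross-validation within the development data, by contrast, mixes members from all periods.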
Results
The average age of the participants was 49.60 (SD 12.54) years, the average starting BMI was 32.43 (SD 6.19), and 81.46% (39,594/48,604) of the participants were female. The class distributions (active and inactive members) changed from 39,369 and 9235 in week 2 to 31,602 and 17,002 in week 12, respectively. With 10-fold cross-validation, extreme gradient boosting models had the best predictive performance, which ranged from 0.85 (95% CI 0.84-0.85) to 0.93 (95% CI 0.93-0.93) for area under the receiver operating characteristic curve and from 0.57 (95% CI 0.56-0.58) to 0.95 (95% CI 0.95-0.96) for area under the precision-recall curve (across 12 weeks of the program). These models were also well calibrated. Results obtained with temporal validation ranged from 0.51 to 0.95 for area under the precision-recall curve and from 0.84 to 0.93 for area under the receiver operating characteristic curve across the 12 weeks. The area under the precision-recall curve improved considerably, by 20%, in week 3 of the program. On the basis of the computed Shapley values, the most important features for predicting disengagement in the following week were those related to the total activity on the platform and entering a weight in the previous weeks.
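To make the two reported metrics concrete, the sketch below computes them from scratch on invented toy scores. It rests on assumptions: "inactive" is treated as the positive class, and the area under the precision-recall curve is computed as step-wise average precision, which may differ from the estimator used in the study. The function names are illustrative. The sketch also computes the prevalence of inactive members from the class counts above, because a classifier that ranks members at random has an expected area under the precision-recall curve equal to the positive-class prevalence, whereas its area under the receiver operating characteristic curve is 0.5 regardless of class balance.

```python
def auroc(y_true, scores):
    """Probability that a random positive outscores a random negative
    (ties count as one half)."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def average_precision(y_true, scores):
    """Step-wise area under the precision-recall curve.
    Tied scores are ranked in input order (a simplification)."""
    ranked = sorted(zip(scores, y_true), key=lambda pair: -pair[0])
    n_pos = sum(y_true)
    hits, total = 0, 0.0
    for rank, (_, label) in enumerate(ranked, start=1):
        if label == 1:
            hits += 1
            total += hits / rank  # precision at each recalled positive
    return total / n_pos


# Invented toy example: 1 = inactive (assumed positive class), 0 = active.
y_true = [1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.3, 0.2]
toy_auroc = auroc(y_true, scores)
toy_ap = average_precision(y_true, scores)

# Chance-level precision-recall baseline from the reported class counts.
prevalence_week_2 = 9235 / (39369 + 9235)
prevalence_week_12 = 17002 / (31602 + 17002)
```

Under these assumptions, the chance-level baseline rises from about 0.19 in week 2 to about 0.35 in week 12, so precision-recall values from different weeks are measured against different baselines.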
Conclusions
This study showed the potential of applying ML predictive algorithms to help predict and understand participants’ disengagement with a web-based weight loss program. Given the association between engagement and health outcomes, these findings can prove valuable in providing better support to individuals to enhance their engagement and potentially achieve greater weight loss.