BACKGROUND
Engagement is key for interventions to achieve successful behavior change and improvements in health. There is little evidence in the literature on applying predictive machine learning (ML) models to data from commercially available weight loss programs to predict disengagement. Such predictions could help support participants in achieving their goals.
OBJECTIVE
This study aimed to use explainable machine learning to predict members' risk of disengagement in each of the 12 weeks of a commercially available online weight loss program.
METHODS
Data were available from 59,686 adults who joined the commercial weight loss program between October 2014 and September 2019. Data included year of birth, sex, height, weight, motivation for joining the program, usage statistics (e.g., weight entries, entries into the food diary, views of the menu, and program content), program type, and weight loss. Random forest (RF), boosted trees (XGB), and L1-regularized logistic regression (LR) models were developed and validated with 10-fold cross-validation. In addition, temporal validation was performed on a test cohort of 16,947 members who participated in the program between April 2018 and September 2019, while the remainder of the data was used for model development. Shapley values were used to identify globally relevant features and to explain individual predictions.
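To illustrate how Shapley values attribute a single prediction to its input features, the following is a minimal brute-force sketch. It is not the study's implementation: the abstract does not state which Shapley estimator was used, and the risk function, feature names (`total_actions`, `weighed_in`, `age`), and coefficients below are hypothetical. The sketch averages each feature's marginal contribution over all feature orderings, which is only feasible for a handful of features.

```python
from itertools import permutations


def shapley_values(predict, instance, baseline):
    """Exact Shapley values for one prediction.

    Each feature's value is the average change in the prediction when that
    feature is switched from its baseline value to its actual value, taken
    over every possible ordering of the features.
    """
    names = list(instance)
    totals = {name: 0.0 for name in names}
    orderings = list(permutations(names))
    for order in orderings:
        current = dict(baseline)            # start with all features at baseline
        previous = predict(current)
        for name in order:
            current[name] = instance[name]  # reveal this feature's actual value
            value = predict(current)
            totals[name] += value - previous
            previous = value
    return {name: total / len(orderings) for name, total in totals.items()}


def toy_risk(x):
    """Hypothetical disengagement risk score (illustration only).

    Risk falls with platform activity and with a recent weigh-in, plus an
    interaction between the two. Age is deliberately ignored.
    """
    return (
        0.8
        - 0.05 * x["total_actions"]
        - 0.2 * x["weighed_in"]
        - 0.02 * x["total_actions"] * x["weighed_in"]
    )


# Hypothetical member and reference point (an inactive member of the same age).
member = {"total_actions": 4, "weighed_in": 1, "age": 50}
reference = {"total_actions": 0, "weighed_in": 0, "age": 50}

phi = shapley_values(toy_risk, member, reference)
for feature, contribution in phi.items():
    print(f"{feature}: {contribution:+.3f}")
```

In this toy case the contributions sum to the difference between the member's score and the reference score, and the unused feature receives a contribution of zero. Global feature relevance is commonly summarized by averaging the absolute contributions over many members.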
RESULTS
The average age of participants was 49.60 (SD 12.54) years, the average starting BMI was 32.43 (SD 6.19), and 81.5% of participants were female. Class distributions (active/inactive members) changed from 39,369/9,235 in week 2 to 31,602/17,002 in week 12. With 10-fold cross-validation, XGB models had the best predictive performance, which ranged from 0.85 (95% CI 0.84-0.85) to 0.93 (95% CI 0.93-0.93) for ROC AUC and from 0.57 (95% CI 0.56-0.58) to 0.95 (95% CI 0.95-0.96) for AUC PRC across the 12 weeks of the program. The XGB models were also well calibrated. Results obtained with temporal validation ranged from 0.51 to 0.95 for AUC PRC and from 0.84 to 0.93 for ROC AUC across the 12 weeks. AUC PRC improved considerably, by 20%, in week 3 of the program. Based on computed Shapley values, the most important features for predicting disengagement in the following week were those related to total activity on the platform and entering a weight in the previous weeks.
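The two discrimination metrics reported above respond differently to class imbalance, which matters here because the active/inactive ratio shifts over the program. The sketch below computes both on made-up labels and scores. It assumes that label 1 marks the class being predicted and uses average precision as the estimator of AUC PRC; the abstract specifies neither the positive class nor the estimator, so both are assumptions.

```python
def roc_auc(labels, scores):
    """ROC AUC as the probability that a randomly chosen positive is scored
    higher than a randomly chosen negative (ties count as one half)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def average_precision(labels, scores):
    """Step-wise area under the precision-recall curve: the mean of the
    precision values at the rank of each positive (tied scores are not
    given special treatment)."""
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    n_pos = sum(labels)
    hits, total = 0, 0.0
    for rank, (_, y) in enumerate(ranked, start=1):
        if y == 1:
            hits += 1
            total += hits / rank  # precision at this positive's rank
    return total / n_pos


# Made-up example: 3 positives among 8 members, scores sorted for readability.
labels = [1, 0, 1, 0, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2]

auc = roc_auc(labels, scores)
ap = average_precision(labels, scores)
prevalence = sum(labels) / len(labels)

print(f"ROC AUC: {auc:.3f}")
print(f"Average precision: {ap:.3f}")
print(f"Positive-class prevalence: {prevalence:.3f}")
```

A classifier that ranks at random has an expected ROC AUC of 0.5 regardless of class balance, whereas its expected AUC PRC is roughly the positive-class prevalence. For that reason, AUC PRC values from different weeks are easier to interpret alongside the class distribution of each week.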
CONCLUSIONS
This study showed the potential of applying ML predictive algorithms to help understand participants' disengagement with an online weight loss program. Given the association between engagement and health outcomes, these findings could be used to better support individuals in achieving greater engagement and, potentially, greater weight loss on the 12-week program.