BACKGROUND
Digital mental health is a promising paradigm for individualized, patient-driven healthcare. For example, cognitive bias modification programs that target interpretation biases (CBM-I) can provide online practice in thinking about ambiguous situations in less threatening ways, without requiring a therapist. However, digital mental health interventions, including CBM-I, are often plagued by a lack of sustained engagement and high attrition rates. New attrition detection and mitigation strategies are needed to improve these interventions.
OBJECTIVE
The present analyses aimed to identify participants at high risk of dropout during the early stage of three web-based trials of multi-session CBM-I and to investigate which self-reported and passively detected feature sets from the intervention and assessment data were most informative in making this prediction.
METHODS
Participants were community adults with trait anxiety or negative future thinking (Study 1 N = 252, Study 2 N = 326, Study 3 N = 699) who had been assigned to CBM-I conditions in three efficacy-effectiveness trials on our team’s public research website. To identify participants at high risk of dropout, we created four unique feature sets: self-reported baseline user characteristics (e.g., demographics), self-reported user context and reactions to the program (e.g., state affect), self-reported user clinical functioning (e.g., mental health symptoms), and passively detected user behavior on the website (e.g., time spent on a web page of CBM-I training exercises; time of day; latency of completing assessments; type of device used). Then, using well-known machine learning algorithms, we investigated which feature sets best predicted participants at high risk of not starting the second training session of a given program.
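To make the passively detected feature set concrete, the sketch below derives the kinds of behavioral features named above (time on a training page, assessment latency, time of day) from timestamped page events. The event names, timestamps, and the `passive_features` helper are hypothetical illustrations, not the study's actual logging schema:

```python
from datetime import datetime

# Hypothetical raw event log for one participant: (event name, ISO timestamp).
events = [
    ("assessment_start", "2023-05-01T20:14:05"),
    ("assessment_end",   "2023-05-01T20:21:47"),
    ("training_page",    "2023-05-01T20:22:10"),
    ("training_done",    "2023-05-01T20:39:02"),
]

def passive_features(events):
    """Derive simple passively detected features from timestamped events."""
    times = {name: datetime.fromisoformat(ts) for name, ts in events}
    return {
        # Latency of completing the baseline assessment, in seconds
        "assessment_latency_s":
            (times["assessment_end"] - times["assessment_start"]).total_seconds(),
        # Time spent on the CBM-I training exercise page, in seconds
        "training_time_s":
            (times["training_done"] - times["training_page"]).total_seconds(),
        # Hour of day (0-23) at which the session started
        "start_hour": times["assessment_start"].hour,
    }

print(passive_features(events))
```

Features like these would then be concatenated with the self-reported feature sets to form each participant's input vector for the classifier.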
RESULTS
The extreme gradient boosting algorithm (XGBoost) performed best, identifying high-risk participants with F1-macro scores of .832 (Study 1 with 146 features), .770 (Study 2 with 87 features), and .917 (Study 3 with 127 features). Features involving passive detection of user behavior contributed the most to the prediction relative to other features (mean Gini importance scores and 95% CIs = .033 ± .014 in Study 1; .029 ± .006 in Study 2; .045 ± .006 in Study 3). However, using all features extracted from a given study led to the best predictive performance.
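The F1-macro metric reported above averages per-class F1 without weighting by class frequency, so the (typically smaller) high-risk dropout class counts as much as the majority class. A minimal stdlib-only sketch of the metric, with a toy label vector for illustration (not study data):

```python
def f1_macro(y_true, y_pred):
    """Macro-averaged F1: compute F1 per class, then take the unweighted mean,
    so minority classes (e.g., high-risk dropouts) are not drowned out."""
    f1s = []
    for c in sorted(set(y_true) | set(y_pred)):
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != c and p == c)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == c and p != c)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1s.append(2 * precision * recall / (precision + recall)
                   if precision + recall else 0.0)
    return sum(f1s) / len(f1s)

# Toy example: 1 = high risk of dropout, 0 = likely to continue
y_true = [1, 0, 0, 1, 0, 1, 0, 0]
y_pred = [1, 0, 0, 0, 0, 1, 1, 0]
print(round(f1_macro(y_true, y_pred), 3))  # -> 0.733
```

This matches the behavior of scikit-learn's `f1_score(..., average="macro")`, which is commonly used for this purpose.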
CONCLUSIONS
These results suggest that using passive indicators of user behavior, alongside self-reported measures, can improve prediction of participants at high risk of dropout early in the course of multi-session CBM-I programs. Further, our analyses highlight the challenge of generalizability in digital health intervention studies and the need for more personalized attrition prevention strategies.