BACKGROUND
Personal sensing, leveraging data passively and near-continuously collected with wearables from patients in their ecological environment, is a promising paradigm to monitor mood disorders (MDs), a major determinant of worldwide disease burden. However, collecting and annotating wearable data is very resource-intensive. Studies of this kind can thus typically afford to recruit only a few dozen patients. This constitutes one of the major obstacles to applying modern supervised machine learning techniques to MD detection.
OBJECTIVE
In this paper, we overcome this data bottleneck and advance the detection of acute MD episodes vs. stable state from wearable data by building on recent advances in self-supervised learning (SSL). This approach leverages unlabeled data to learn representations during pre-training, which are subsequently exploited for a supervised task.
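To make the pre-training idea concrete, the following is a minimal sketch of how a surrogate task can be built from unlabeled signal segments. It assumes masked-value reconstruction as the surrogate task, which is one common choice and not necessarily the one used in this work; the function names `make_masked_pair` and `reconstruction_loss`, the masking ratio, and the mask value are hypothetical and chosen only for illustration.

```python
import random


def make_masked_pair(segment, mask_ratio=0.25, mask_value=0.0, seed=None):
    """Hide a random subset of samples in an unlabeled segment.

    Returns the masked copy and a boolean mask marking hidden positions.
    The original segment acts as the reconstruction target, so no human
    annotation is needed. (Hypothetical helper, not the authors' code.)
    """
    rng = random.Random(seed)
    n_masked = max(1, int(len(segment) * mask_ratio))
    masked_idx = set(rng.sample(range(len(segment)), n_masked))
    masked = [mask_value if i in masked_idx else x
              for i, x in enumerate(segment)]
    mask = [i in masked_idx for i in range(len(segment))]
    return masked, mask


def reconstruction_loss(prediction, target, mask):
    """Mean squared error computed over the masked positions only."""
    errors = [(p - t) ** 2 for p, t, m in zip(prediction, target, mask) if m]
    return sum(errors) / len(errors)


# Usage: the pair (masked, segment) is what a model would be pre-trained on.
segment = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
masked, mask = make_masked_pair(segment, mask_ratio=0.25, seed=0)
```

In such a setup, a model pre-trained to minimize this loss would then be fine-tuned on the labeled target task, here acute episode vs. stable state.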
METHODS
We collected open-access datasets recorded with an Empatica E4 and spanning different personal sensing tasks unrelated to MD monitoring, from emotion recognition in Super Mario players to stress detection in undergraduates, and devised a pre-processing pipeline performing on-/off-body detection, sleep-wake detection, segmentation, and (optionally) feature extraction. With 161 E4-recorded subjects, we introduce E4SelfLearning, the largest open-access collection to date, and its pre-processing pipeline. We developed a novel E4-tailored Transformer architecture (E4mer), serving as a blueprint for both SSL and fully supervised learning; we assessed whether and under which conditions self-supervised pre-training led to an improvement over two fully supervised baselines, i.e., the fully supervised E4mer and a classical baseline (XGBoost), in detecting acute mood episodes from recording segments taken from 64 patients (half acute, half stable).
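The segmentation step of such a pipeline can be sketched as follows. This is an illustrative example under stated assumptions, not the released pipeline: the function name `segment_recording`, the window length, the step, and the rule of discarding any window that overlaps samples flagged as invalid (for instance, off-body) are all hypothetical.

```python
def segment_recording(samples, window_len, step=None, valid=None):
    """Cut a recording into fixed-length windows.

    samples    : sequence of sensor readings
    window_len : number of samples per window
    step       : hop between window starts (defaults to window_len,
                 i.e. non-overlapping windows)
    valid      : optional per-sample booleans (e.g. on-body flags);
                 windows containing any invalid sample are dropped
                 (an assumption made for this sketch)
    """
    if step is None:
        step = window_len
    if window_len <= 0 or step <= 0:
        raise ValueError("window_len and step must be positive")
    if valid is not None and len(valid) != len(samples):
        raise ValueError("valid must have one flag per sample")

    windows = []
    # Trailing samples that do not fill a whole window are discarded.
    for start in range(0, len(samples) - window_len + 1, step):
        end = start + window_len
        if valid is not None and not all(valid[start:end]):
            continue
        windows.append(list(samples[start:end]))
    return windows


# Usage: ten samples, windows of four, sample 5 flagged as off-body.
recording = list(range(10))
flags = [True] * 10
flags[5] = False
clean_windows = segment_recording(recording, window_len=4, valid=flags)
```

Each retained window would then be passed either to feature extraction or directly to a sequence model.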
RESULTS
SSL outperforms fully supervised pipelines using either our novel E4mer or XGBoost: 81.23% of recording segments correctly classified, against 75.35% (E4mer) and 72.02% (XGBoost). SSL performance is strongly associated with the specific surrogate task employed for pre-training as well as with unlabeled data availability.
CONCLUSIONS
We showed that SSL, a paradigm where a model is pre-trained on unlabeled data with no need for human annotations prior to deployment on the supervised target task of interest, helps overcome the annotation bottleneck; the choice of the pre-training surrogate task and the size of the unlabeled dataset used for pre-training are key determinants of SSL success. We introduced an E4-tailored Transformer architecture (E4mer) that can be used for SSL, and we share the E4SelfLearning collection, along with its pre-processing pipeline, which can foster and expedite future research into SSL for personal sensing.