Forecasting mental states in schizophrenia using digital phenotyping data: a retrospective study (Preprint)

Author:

Jean Thierry, Guay Hottin Rose, Orban Pierre

Abstract

BACKGROUND

Machine learning models that exploit rich digital phenotyping data to forecast mental states could improve clinical practice in psychiatry. A current limitation is that such predictive models tend to disregard a frequent property of the prediction targets, namely the ordinal nature of the rating scales they are often obtained from.

OBJECTIVE

We first aimed to compare the performance of ordinal regression and binary classification in predicting various mental states over different forecast horizons. We also assessed how the tree-based eXtreme Gradient Boosting (XGBoost) algorithm performs compared to the long short-term memory (LSTM) algorithm, a type of recurrent neural network that is popular in digital phenotyping studies aimed at forecasting mental states.

METHODS

The CrossCheck dataset includes self-reports of mental states and smartphone sensor data contributed by patients with schizophrenia. Participants completed surveys on various mental states every 2-3 days on 4-point ordinal rating scales. Passive sensing data was collected continuously and aggregated over 6-hour periods. We trained 120 machine learning models to forecast mental states from passive sensing data: 10 mental states (e.g., Calm, Depressed, Seeing things) on 2 predictive tasks (ordinal regression, binary classification) with 2 learning algorithms (XGBoost, LSTM) over 3 forecast horizons (same day, next day, next week). While models were primarily evaluated with performance metrics that account for class imbalance (macro-averaged mean absolute error [MAMAE] for ordinal regression and balanced accuracy [BAcc] for binary classification), the impact of using metrics that do not account for imbalance (mean absolute error, accuracy) was also investigated.
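To illustrate why the choice of metric matters under class imbalance, the following minimal sketch computes the four metrics named above on toy labels. It is not the authors' code: the 0-3 coding of the 4-point scale, the class proportions, and the majority-class "model" are all assumptions made for illustration only.

```python
from collections import defaultdict


def mean_absolute_error(y_true, y_pred):
    """Plain MAE: every sample counts equally, so frequent classes dominate."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def macro_averaged_mae(y_true, y_pred):
    """MAMAE: compute the MAE within each true class, then average the classes."""
    errors = defaultdict(list)
    for t, p in zip(y_true, y_pred):
        errors[t].append(abs(t - p))
    per_class = [sum(e) / len(e) for e in errors.values()]
    return sum(per_class) / len(per_class)


def accuracy(y_true, y_pred):
    """Plain accuracy: fraction of samples predicted correctly."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def balanced_accuracy(y_true, y_pred):
    """BAcc: compute recall within each true class, then average the classes."""
    hits = defaultdict(list)
    for t, p in zip(y_true, y_pred):
        hits[t].append(t == p)
    per_class = [sum(h) / len(h) for h in hits.values()]
    return sum(per_class) / len(per_class)


# Hypothetical imbalanced ordinal labels, coded 0-3 (assumed coding).
ordinal_true = [0] * 70 + [1] * 20 + [2] * 8 + [3] * 2
ordinal_pred = [0] * 100  # trivial model: always predict the majority class

# Hypothetical imbalanced binary labels.
binary_true = [0] * 90 + [1] * 10
binary_pred = [0] * 100  # trivial model: always predict the majority class

mae = mean_absolute_error(ordinal_true, ordinal_pred)   # 0.42
mamae = macro_averaged_mae(ordinal_true, ordinal_pred)  # 1.5
acc = accuracy(binary_true, binary_pred)                # 0.9
bacc = balanced_accuracy(binary_true, binary_pred)      # 0.5

print(f"MAE={mae:.2f} MAMAE={mamae:.2f} Acc={acc:.2f} BAcc={bacc:.2f}")
```

On these toy labels, a model that learns nothing still obtains a low MAE and a high accuracy, whereas MAMAE and BAcc expose it as uninformative because each class contributes equally to the average.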

RESULTS

The dataset included 6364 surveys and 23,551 days of smartphone data from 62 participants. Marked class imbalance was observed for the ordinal labels, an issue that was only partially resolved by recoding original labels into binary classes. Overall, 45/60 ordinal regression models performed significantly above baseline, with MAMAE between 1.19 and 0.77, and 58/60 binary classification models performed significantly above baseline, with BAcc between 58% and 73%. Of note, evaluation metrics that do not account for class imbalance erroneously suggested good performance. After scaling performance metrics to allow their comparison, ordinal regression and binary classification models achieved comparable performance on average. XGBoost models performed better than or on par with LSTM models. As the forecast horizon expanded, a significant yet very small decrease in performance was observed.

CONCLUSIONS

The targets of mental state forecast models should preserve the valuable clinical information contained in ordinal rating scales. This is especially true given that recoding multiple ordinal classes into binary classes does not lead to any gain in predictive performance. Moreover, model development should account for class imbalance, particularly for ordinal regression, where imbalance across classes is often more pronounced. Finally, our findings do not lend support to the implicitly assumed superiority of recurrent neural networks for forecasting.

Publisher

JMIR Publications Inc.
