Detecting Depression Through Temporal Topic Modeling of Tweets: Insights from a 180-Day Study (Preprint)

Authors:

Chandrasekaran Ranganathan, Kotaki Suhas, Nagaraja Abhilash Hosaagrahara

Abstract

BACKGROUND

The advent of social media platforms like X (formerly known as Twitter) provides a useful way to unobtrusively monitor and mine user-generated information, and to apply advanced NLP and text-mining algorithms to detect mental illnesses such as depression.

OBJECTIVE

Using Twitter data, this study examines how depression markers change over the course of the disease in individuals. Our goals are (1) to analyze Twitter data to identify temporal changes in depression markers 90 days before and after a clinical diagnosis; (2) to use topic modeling to extract and analyze key themes related to depression from the tweets of diagnosed individuals; (3) to evaluate the effectiveness of machine learning classifiers in distinguishing between depressed and non-depressed users based on tweet content; and (4) to provide insights into how the progression of depression and its markers can be tracked and understood through temporal analysis of social media data.

METHODS

We identified 229 depressed individuals and gathered 246,637 tweets they posted over 180 days. CorEx topic modeling was used to mine the tweets for themes that characterize depression-related discourse, followed by conditional logistic regression to assess the odds of each theme occurring in tweets in the post-diagnosis period compared with the pre-diagnosis period. Three machine learning classifiers (support vector machines, naive Bayes, and logistic regression) were built and tested to distinguish depressed users from others.
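The pre/post comparison can be illustrated with a minimal sketch. It assumes a simplified reading of the design, not necessarily the authors' exact model: each user contributes one pre-diagnosis and one post-diagnosis observation per theme, and a theme counts as present if any tweet in the window contains a keyword. Under that assumption, conditional logistic regression with a single binary predictor reduces to the ratio of discordant users. The function names, the keyword matching, the toy data, and the choice to count the diagnosis day as post-diagnosis are all hypothetical.

```python
import math
from datetime import date, timedelta

WINDOW = timedelta(days=90)  # 90 days on each side of the diagnosis date


def theme_by_period(tweets, diagnosis_date, keywords):
    """Return (pre, post): whether any in-window tweet mentions a keyword.

    tweets is a list of (posted_on, text) tuples. Tweets posted on the
    diagnosis date itself are counted as post-diagnosis (an assumption).
    """
    pre = post = False
    for posted_on, text in tweets:
        offset = posted_on - diagnosis_date
        if abs(offset) > WINDOW:
            continue  # outside the 180-day study window
        if not any(k in text.lower() for k in keywords):
            continue
        if offset < timedelta(0):
            pre = True
        else:
            post = True
    return pre, post


def matched_odds_ratio(pairs):
    """Odds ratio (post vs. pre) with a Wald 95% CI from per-user pairs.

    Only discordant users (theme present in exactly one period) inform
    the conditional estimate; concordant users drop out.
    """
    only_post = sum(1 for pre, post in pairs if post and not pre)
    only_pre = sum(1 for pre, post in pairs if pre and not post)
    if only_post == 0 or only_pre == 0:
        raise ValueError("need discordant users in both directions")
    log_or = math.log(only_post / only_pre)
    se = math.sqrt(1 / only_post + 1 / only_pre)
    return (math.exp(log_or),
            math.exp(log_or - 1.96 * se),
            math.exp(log_or + 1.96 * se))


# Toy user: one hypothetical "treatment" keyword, diagnosis on 2023-03-01.
toy_user = [
    (date(2023, 1, 10), "cannot sleep again"),
    (date(2023, 4, 2), "started therapy today"),
    (date(2023, 9, 1), "therapy update"),  # more than 90 days after, ignored
]
toy_flags = theme_by_period(toy_user, date(2023, 3, 1), {"therapy"})

# Toy cohort: 12 users mention the theme only after diagnosis, 6 only before.
toy_pairs = ([(False, True)] * 12 + [(True, False)] * 6
             + [(True, True)] * 20 + [(False, False)] * 5)
toy_or, toy_lo, toy_hi = matched_odds_ratio(toy_pairs)
```

An analysis at the tweet level, or one with additional covariates, would need a full conditional logistic regression fit rather than this closed-form shortcut.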

RESULTS

Our analysis yielded seven themes related to depression: causes, physical symptoms, mental symptoms, swear words, treatment, coping and support mechanisms, and lifestyle. The odds of tweeting about causes, physical symptoms, mental symptoms, treatment, and coping/support mechanisms in the post-diagnosis period, relative to the pre-diagnosis period, were 2.22 (95% CI 1.29-3.82), 0.32 (95% CI 0.14-0.71), 0.74 (95% CI 0.62-0.89), 3.1 (95% CI 1.71-5.61), and 1.86 (95% CI 1.24-2.81), respectively. Among the machine learning classifiers tested, logistic regression yielded the best performance (AUC=0.91) in distinguishing depressed users from others.
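As a reading aid for the odds ratios above, the following sketch recovers the implied standard error on the log scale and labels the direction of each effect. It assumes the intervals are Wald-type, that is, symmetric around the log odds ratio; the abstract does not state how they were computed. The dictionary and function names are illustrative only.

```python
import math

Z_95 = 1.959964  # standard normal quantile for a two-sided 95% interval

# (odds ratio, lower bound, upper bound) as reported in the abstract.
reported = {
    "causes": (2.22, 1.29, 3.82),
    "physical symptoms": (0.32, 0.14, 0.71),
    "mental symptoms": (0.74, 0.62, 0.89),
    "treatment": (3.1, 1.71, 5.61),
    "coping/support mechanisms": (1.86, 1.24, 2.81),
}


def summarize(lower, upper):
    """Return (log-scale SE, geometric midpoint, direction) for a 95% CI."""
    se = (math.log(upper) - math.log(lower)) / (2 * Z_95)
    midpoint = math.sqrt(lower * upper)  # equals the OR if the CI is Wald-type
    if lower > 1:
        direction = "more likely post-diagnosis"
    elif upper < 1:
        direction = "less likely post-diagnosis"
    else:
        direction = "inconclusive"
    return se, midpoint, direction


summary = {theme: summarize(lo, hi) for theme, (_, lo, hi) in reported.items()}

for theme, (se, midpoint, direction) in summary.items():
    print(f"{theme}: SE(log OR)={se:.3f}, midpoint={midpoint:.2f}, {direction}")
```

Because none of the five intervals contains 1, each theme shows a shift in one direction: tweets about physical and mental symptoms become less likely after diagnosis, while tweets about causes, treatment, and coping/support mechanisms become more likely.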

CONCLUSIONS

Temporal analysis of Twitter data helps build a comprehensive view of how depression progresses in patients. In addition to identifying changing comorbidities and mental symptoms, it can help track patients' use of coping and support mechanisms and treatments, as well as causes of depression.

Publisher

JMIR Publications Inc.
