Machine learning of language use on Twitter reveals weak and non-specific predictions

Author:

Kelley Sean W., Mhaonaigh Caoimhe Ní, Burke Louise, Whelan Robert, Gillan Claire M.

Abstract

Depressed individuals use language differently than healthy controls, and it has been proposed that social media posts can be used to identify depression. Much of the evidence behind this claim relies on indirect measures of mental health, and few studies have tested whether these language features are specific to depression versus other aspects of mental health. We analysed the Tweets of 1006 participants who completed questionnaires assessing symptoms of depression and 8 other mental health conditions. Daily Tweets were subjected to textual analysis and the resulting linguistic features were used to train an Elastic Net model on depression severity, using nested cross-validation. We then tested performance in a held-out test set (30%), comparing predictions of depression versus 8 other aspects of mental health. The depression-trained model had modest out-of-sample predictive performance, explaining 2.5% of variance in depression symptoms (R² = 0.025, r = 0.16). The performance of this model was as good or superior when used to identify other aspects of mental health: schizotypy, social anxiety, eating disorders and generalised anxiety. Performance was above chance for obsessive-compulsive disorder and apathy, but not significant for alcohol abuse or impulsivity. Machine learning analysis of social media data, when trained on well-validated clinical instruments, could not make meaningful individualised predictions regarding users’ mental health. Furthermore, language use associated with depression was non-specific, having similar performance in predicting other mental health problems.
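The figures in the abstract can be related to each other with a small calculation. The sketch below is illustrative only and is not the authors' code: it assumes the reported variance explained corresponds to the squared correlation between predicted and observed scores (0.16 squared is about 0.026, which is consistent with the reported 2.5%), and it uses made-up synthetic scores. The helper names `pearson_r` and `split_train_test` are hypothetical, and the exact split sizes used in the study are not stated in the abstract.

```python
import math
import random


def pearson_r(xs, ys):
    """Pearson correlation between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def split_train_test(n_participants, test_fraction=0.30, seed=0):
    """Shuffle participant indices and hold out a fraction for testing.

    Hypothetical helper: the study's actual splitting procedure may differ.
    """
    indices = list(range(n_participants))
    random.Random(seed).shuffle(indices)
    n_test = round(n_participants * test_fraction)
    return indices[n_test:], indices[:n_test]


# Variance explained implied by the reported correlation (assumption: R^2 = r^2).
reported_r = 0.16
implied_r2 = reported_r ** 2

# A 30% hold-out of the 1006 participants mentioned in the abstract.
train_idx, test_idx = split_train_test(1006)

# Synthetic scores with a weak built-in relationship, standing in for
# model predictions and questionnaire scores on the held-out participants.
rng = random.Random(1)
predicted = [rng.gauss(0, 1) for _ in test_idx]
noise_scale = math.sqrt(1 - reported_r ** 2)
observed = [reported_r * p + noise_scale * rng.gauss(0, 1) for p in predicted]

sample_r = pearson_r(predicted, observed)
print(f"implied variance explained: {implied_r2:.3f}")
print(f"held-out participants: {len(test_idx)}")
print(f"synthetic sample r: {sample_r:.2f}, r squared: {sample_r ** 2:.3f}")
```

A correlation of this size leaves more than 97% of the variance in symptom scores unexplained, which is why the abstract describes the predictions as too weak for individualised use.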

Funder

SFI-HRB-Wellcome Trust

Publisher

Springer Science and Business Media LLC

Subject

Health Information Management,Health Informatics,Computer Science Applications,Medicine (miscellaneous)


Cited by 11 articles.
