Transfer Learning for Risk Classification of Social Media Posts: Model Evaluation Study

Published: 2020-05-13 Issue: 5 Volume: 22 Page: e15371
ISSN: 1438-8871
Container-title: Journal of Medical Internet Research
Language: en
Short-container-title: J Med Internet Res

Author:

Howard Derek, Maslej Marta M, Lee Justin, Ritchie Jacob, Woollard Geoffrey, French Leon

Abstract

Background: Mental illness affects a significant portion of the worldwide population. Online mental health forums can provide a supportive environment for those afflicted and also generate a large amount of data that can be mined to predict mental health states using machine learning methods.

Objective: This study aimed to benchmark multiple methods of text feature representation for social media posts and compare their downstream use with automated machine learning (AutoML) tools. We tested on datasets that contain posts labeled for perceived suicide risk or moderator attention in the context of self-harm. Specifically, we assessed the ability of the methods to prioritize posts that a moderator would identify for immediate response.

Methods: We used 1588 labeled posts from the Computational Linguistics and Clinical Psychology (CLPsych) 2017 shared task collected from the Reachout.com forum. Posts were represented using lexicon-based tools, including Valence Aware Dictionary and sEntiment Reasoner, Empath, and Linguistic Inquiry and Word Count, and also using pretrained artificial neural network models, including DeepMoji, Universal Sentence Encoder, and Generative Pretrained Transformer-1 (GPT-1). We used Tree-based Pipeline Optimization Tool (TPOT) and Auto-Sklearn as AutoML tools to generate classifiers to triage the posts.

Results: The top-performing system used features derived from the GPT-1 model, which was fine-tuned on over 150,000 unlabeled posts from Reachout.com. Our top system had a macroaveraged F1 score of 0.572, providing a new state-of-the-art result on the CLPsych 2017 task. This was achieved without additional information from metadata or preceding posts. Error analyses revealed that this top system often misses expressions of hopelessness. In addition, we have presented visualizations that aid in the understanding of the learned classifiers.

Conclusions: In this study, we found that transfer learning is an effective strategy for predicting risk with relatively little labeled data and noted that fine-tuning of pretrained language models provides further gains when large amounts of unlabeled text are available.
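
The macroaveraged F1 score reported in the abstract is the unweighted mean of per-class F1 scores, so rare classes count as much as common ones. The following is a minimal sketch of that computation, not the study's evaluation code. The function names, the label names ("low", "medium", "high"), and the toy predictions are assumptions made for illustration. The optional `labels` argument restricts which classes are averaged, since this record does not state which classes the shared task averaged over.

```python
def f1_for_class(y_true, y_pred, label):
    """F1 score for one class, treating it as the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == label and p == label)
    fp = sum(1 for t, p in pairs if t != label and p == label)
    fn = sum(1 for t, p in pairs if t == label and p != label)
    if tp == 0:
        # No true positives: precision or recall is zero (or undefined),
        # so the class F1 is taken to be 0.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def macro_f1(y_true, y_pred, labels=None):
    """Unweighted mean of per-class F1 over `labels` (default: all seen labels)."""
    if labels is None:
        labels = sorted(set(y_true) | set(y_pred))
    return sum(f1_for_class(y_true, y_pred, lab) for lab in labels) / len(labels)


# Hypothetical toy data, not taken from the study.
y_true = ["low", "low", "high", "high", "medium", "medium"]
y_pred = ["low", "high", "high", "high", "medium", "low"]

score_all = macro_f1(y_true, y_pred)
score_subset = macro_f1(y_true, y_pred, labels=["high", "medium"])
print(round(score_all, 3), round(score_subset, 3))
```

Because each class contributes equally to the mean, a classifier that ignores a small high-risk class is penalized heavily, which is why this metric suits triage tasks with imbalanced labels.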

Publisher

JMIR Publications Inc.

Subject

Health Informatics

References: 52 articles.

1. The Descriptive Epidemiology of Commonly Occurring Mental Disorders in the United States

2. Exploring Comorbidity Within Mental Disorders Among a Danish National Population

3. Prevalence, Severity, and Comorbidity of 12-Month DSM-IV Disorders in the National Comorbidity Survey Replication

4. Age of onset of mental disorders: a review of recent literature

5. Mental disorders, comorbidity and suicidal behavior: Results from the National Comorbidity Survey Replication

Cited by 27 articles.

1. Adapting transformer-based language models for heart disease detection and risk factors extraction; Journal of Big Data; 2024-04-04

2. Unraveling minds in the digital era: a review on mapping mental health disorders through machine learning techniques using online social media; Social Network Analysis and Mining; 2024-04-04

3. Depression Symptom Identification Through Acoustic Speech Analysis: A Transfer Learning Approach; Traitement du Signal; 2024-02-29

4. JMS-QA: A Joint Hierarchical Architecture for Mental Health Question Answering; IEEE/ACM Transactions on Audio, Speech, and Language Processing; 2024

5. Domain Adaptation in Medical Imaging: Evaluating the Effectiveness of Transfer Learning; Studies in Big Data; 2024