Examining the Gateway Hypothesis and Mapping Substance Use Pathways on Social Media: Machine Learning Approach

Author:

Yuan Yunhao, Kasson Erin, Taylor Jordan, Cavazos-Rehg Patricia, De Choudhury Munmun, Aledavood Talayeh

Abstract

Background: Substance misuse presents significant global public health challenges. Understanding transitions between substance types and the timing of shifts to polysubstance use is vital to developing effective prevention and recovery strategies. The gateway hypothesis suggests that high-risk substance use is preceded by lower-risk substance use. However, the source of this correlation is hotly contested. While some claim that low-risk substance use causes subsequent, riskier substance use, most people who use low-risk substances do not escalate to higher-risk substances. Social media data hold the potential to shed light on the factors contributing to substance use transitions.

Objective: By leveraging social media data, our study aimed to gain a better understanding of substance use pathways. By identifying and analyzing the transitions of individuals between different risk levels of substance use, our goal was to find specific linguistic cues in individuals’ social media posts that could indicate escalating or de-escalating patterns in substance use.

Methods: We conducted a large-scale analysis using data from Reddit, collected between 2015 and 2019, consisting of over 2.29 million posts and approximately 29.37 million comments by around 1.4 million users from substance use subreddits. These data facilitated the creation of a risk transition data set reflecting these users' substance use behaviors. We deployed deep learning and machine learning techniques to predict escalation or de-escalation transitions in risk levels, based on the initial transition phases documented in posts and comments. We also conducted a linguistic analysis of the language patterns associated with transitions in substance use, emphasizing the role of n-gram features in predicting future risk trajectories.

Results: Our results showed promise in predicting escalation or de-escalation transitions in risk levels, based on Reddit users' historical data from the initial transition phases across drug-related subreddits, with an accuracy of 78.48% and an F1-score of 79.20%. Key predictive features included specific substance names and tools indicative of future risk escalation. Our linguistic analysis showed that terms linked with harm reduction strategies were instrumental in signaling de-escalation, whereas descriptors of frequent substance use were characteristic of escalating transitions.

Conclusions: This study sheds light on the complexities surrounding the gateway hypothesis of substance use through an examination of web-based behavior on Reddit. While certain findings validate the hypothesis, indicating a progression from lower-risk substances such as marijuana to higher-risk ones, a significant number of individuals did not show this transition. The research underscores the potential of using machine learning with social media analysis to predict substance use transitions. Our results point toward future directions for leveraging social media data in substance use research, underlining the importance of continued exploration before suggesting direct implications for interventions.
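
The abstract refers to n-gram features and reports accuracy and an F1-score for a two-class task (escalation vs de-escalation). As an illustration only, and not the authors' pipeline, the following minimal sketch shows how word n-grams can be counted from a post and how accuracy and F1 are computed from binary labels. The function names, the sample text, and the toy labels are all hypothetical; the abstract does not specify the tokenization, the n-gram range, or which class is treated as positive.

```python
from collections import Counter


def ngrams(text, n):
    """Return word n-grams from a lowercased, whitespace-split text."""
    tokens = text.lower().split()
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def ngram_counts(text, max_n=2):
    """Count all n-grams of length 1..max_n (max_n=2 is an assumption)."""
    counts = Counter()
    for n in range(1, max_n + 1):
        counts.update(ngrams(text, n))
    return counts


def accuracy_and_f1(y_true, y_pred, positive=1):
    """Compute accuracy and the F1-score of the `positive` class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    accuracy = sum(1 for t, p in pairs if t == p) / len(pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    # F1 is the harmonic mean of precision and recall.
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return accuracy, f1


# Hypothetical post text and toy labels (1 = escalation, 0 = de-escalation).
features = ngram_counts("tapering my dose slowly")
acc, f1 = accuracy_and_f1([1, 1, 0, 0], [1, 0, 0, 1])
```

In practice, counts like these would be fed to a classifier as features; the sketch stops at feature extraction and metric computation because the abstract does not name the models used.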

Publisher

JMIR Publications Inc.
