Using Reddit data to investigate perspectives on the COVID-19 pandemic using natural language processing: a comparative study of the US, the UK, Canada and Australia (Preprint)

Author:

Hu Mengke, Conway Mike

Abstract

BACKGROUND

Since the World Health Organization (WHO) declared COVID-19 a pandemic on March 11, 2020, the disease has had an unprecedented impact worldwide, with more than 276 million confirmed cases and 5.3 million deaths as of December 21, 2021 [1]. Social media platforms such as Reddit can serve as a resource for enhancing situational awareness, particularly for monitoring public attitudes and behavior during the crisis. The insights gained can then be used to support communication and health promotion messaging.

OBJECTIVE

With this work, we compare public attitudes towards the 2020/2021 COVID-19 pandemic across four predominantly English-speaking countries (the United States, the United Kingdom, Canada, and Australia) using data derived from the social media platform Reddit.

METHODS

We utilized a natural language processing method called topic modeling (more specifically, Latent Dirichlet Allocation). Topic modeling is a popular unsupervised learning technique that can be used to automatically infer topics (i.e., semantically related categories) from a large corpus of text. We derived our data from six country-specific, COVID-19-related subreddits (r/CoronavirusAustralia, r/CoronavirusDownunder, r/CoronavirusCanada, r/CanadaCoronavirus, r/CoronavirusUK, r/coronavirusus). We used topic modeling methods to investigate and compare topics of concern for each country.
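
To illustrate how Latent Dirichlet Allocation infers topics from text, the following is a minimal sketch that fits a topic model by collapsed Gibbs sampling. It is not the study's implementation: the abstract does not state which inference method, software, or settings were used, so the function name `lda_gibbs`, the hyperparameters `alpha` and `beta`, and the toy posts are all illustrative assumptions. The posts are invented examples, not real Reddit data.

```python
import random


def lda_gibbs(docs, num_topics, iterations=200, alpha=0.1, beta=0.01, seed=0):
    """Fit LDA to tokenized documents with collapsed Gibbs sampling.

    Hypothetical helper for illustration only. Returns the topic-word
    count matrix, the document-topic count matrix, and the vocabulary.
    """
    rng = random.Random(seed)
    vocab = sorted({w for doc in docs for w in doc})
    word_id = {w: i for i, w in enumerate(vocab)}
    vocab_size = len(vocab)

    doc_topic = [[0] * num_topics for _ in docs]                 # tokens in doc d assigned to topic k
    topic_word = [[0] * vocab_size for _ in range(num_topics)]   # times word w is assigned to topic k
    topic_total = [0] * num_topics                               # total tokens assigned to topic k
    assignments = []

    # Start from a random topic assignment for every token.
    for d, doc in enumerate(docs):
        z_doc = []
        for w in doc:
            k = rng.randrange(num_topics)
            z_doc.append(k)
            doc_topic[d][k] += 1
            topic_word[k][word_id[w]] += 1
            topic_total[k] += 1
        assignments.append(z_doc)

    for _ in range(iterations):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                wid = word_id[w]
                k = assignments[d][i]
                # Remove the token's current assignment from the counts.
                doc_topic[d][k] -= 1
                topic_word[k][wid] -= 1
                topic_total[k] -= 1
                # Weight each topic by how common it is in this document
                # and how strongly it is associated with this word.
                weights = [
                    (doc_topic[d][t] + alpha)
                    * (topic_word[t][wid] + beta)
                    / (topic_total[t] + vocab_size * beta)
                    for t in range(num_topics)
                ]
                k = rng.choices(range(num_topics), weights=weights)[0]
                assignments[d][i] = k
                doc_topic[d][k] += 1
                topic_word[k][wid] += 1
                topic_total[k] += 1

    return topic_word, doc_topic, vocab


# Invented toy posts (assumption: not drawn from the study's subreddits).
posts = [
    "lockdown rules extended lockdown policy".split(),
    "government policy lockdown rules announced".split(),
    "vaccine trial results vaccine doses".split(),
    "vaccine doses trial approved".split(),
]

topic_word, doc_topic, vocab = lda_gibbs(posts, num_topics=2)

# Show the three most frequent words assigned to each topic.
for k, row in enumerate(topic_word):
    top = sorted(range(len(vocab)), key=lambda i: -row[i])[:3]
    print(f"topic {k}:", [vocab[i] for i in top])
```

In a comparative setting like the one described here, one plausible design is to fit a separate model per country's subreddits and then compare the highest-weight words of each topic. Whether the authors did this or fit a single shared model is not stated in the abstract.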

RESULTS

From the Reddit data we found that (1) the volume of posting declined consistently across all four countries during the study period (Feb. 2020 to Nov. 2020); (2) during lockdown events, the volume of posts peaked; and (3) the UK and Australian subreddits contained much more policy discussion – and less conspiratorial content – than the US or Canadian subreddits.

CONCLUSIONS

This work demonstrated that (a) there were key differences between salient topics discussed across the four countries, and (b) Reddit data has the potential to provide insights not readily apparent in survey-based approaches.

Publisher

JMIR Publications Inc.
