Characterisation of COVID-19-Related Tweets in the Croatian Language: Framework Based on the Cro-CoV-cseBERT Model

Published:2021-11-06 Issue:21 Volume:11 Page:10442
ISSN:2076-3417
Container-title:Applied Sciences
language:en
Short-container-title:Applied Sciences

Author:

Babić Karlo, Petrović Milan, Beliga Slobodan, Martinčić-Ipšić Sanda, Matešić Mihaela, Meštrović Ana

Abstract

This study aims to provide insights into the COVID-19-related communication on Twitter in the Republic of Croatia. For that purpose, we developed an NLP-based framework that enables automatic analysis of a large dataset of tweets in the Croatian language. We collected and analysed 206,196 tweets related to COVID-19 and constructed a dataset of 10,000 tweets which we manually annotated with a sentiment label. We trained the Cro-CoV-cseBERT language model for the representation and clustering of tweets. Additionally, we compared the performance of four machine learning algorithms on the task of sentiment classification. After identifying the best performing setup of NLP methods, we applied the proposed framework in the task of characterisation of COVID-19 tweets in Croatia. More precisely, we performed sentiment analysis and tracked the sentiment over time. Furthermore, we detected how tweets are grouped into clusters with similar themes across three pandemic waves. Additionally, we characterised the tweets by analysing the distribution of sentiment polarity (in each thematic cluster and over time) and the number of retweets (in each thematic cluster and sentiment class). These results could be useful for additional research and interpretation in the domains of sociology, psychology or other sciences, as well as for the authorities, who could use them to address crisis communication problems.

Publisher

MDPI AG

Subject

Fluid Flow and Transfer Processes, Computer Science Applications, Process Chemistry and Technology, General Engineering, Instrumentation, General Materials Science

Link

https://www.mdpi.com/2076-3417/11/21/10442/pdf

References: 64 articles.

1. Risk Communication for Public Health Emergencies

2. Social media can have an impact on how we manage and investigate the COVID-19 pandemic

3. Infodemiology: the epidemiology of (mis)information

4. Machine learning on big data: Opportunities and challenges

5. COVID-19 Sensing: Negative Sentiment Analysis on Social Media in China via BERT Model

Cited by 17 articles.

1. Characterizing Public Sentiments and Drug Interactions during COVID-19: A Pretrained Language Model and Network Analysis of Social Media Discourse (Preprint);2024-06-28

2. Hybrid Natural Language Processing Model for Sentiment Analysis during Natural Crisis;Electronics;2024-05-20

3. Machine Learning and Deep Learning Sentiment Analysis Models: Case Study on the SENT-COVID Corpus of Tweets in Mexican Spanish;Informatics;2024-04-23

4. Recursively Autoregressive Autoencoder for Pyramidal Text Representation;IEEE Access;2024

5. First Insight into Social Media User Sentiment Spreading Potential to Enhance the Conceptual Model for Disinformation Detection;Data Science—Analytics and Applications;2024