BACKGROUND
As vaccination against the coronavirus disease (COVID–19) continues, Twitter has become one of the social media platforms on which COVID–19 vaccination is discussed. These discussions often undermine public confidence in the vaccine. Researchers use the text generated by these discussions to extract topics and perform sentiment analysis at the provincial, country, or continent level, without considering local communities.
OBJECTIVE
This study aimed to use geoclustering of Twitter posts to characterize city-level variations in sentiment toward COVID–19 vaccine-related topics in the three largest South African cities (Cape Town, Durban, and Johannesburg).
METHODS
We collected and processed a dataset of COVID–19 vaccine-related tweets (n=25,000) posted in South Africa from January 2021 to August 2021, using the academic researcher Twitter application programming interface (API) with keywords such as vaccine, vaccination, AstraZeneca, Oxford-AstraZeneca, VaccineToSaveSouthAfrica, JohnsonJohnson, and pfizer. Tweets were mapped to their geolocations. Latent Dirichlet allocation was used to identify frequently discussed topics across the cities. Sentiment scores (negative, neutral, and positive) were assigned using the Valence Aware Dictionary and sEntiment Reasoner (VADER) together with a support vector machine classification algorithm.
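The geolocation mapping step could be sketched as follows. This is a minimal illustration, not the procedure used in the study: the abstract does not say how tweets were assigned to cities, so the nearest-city rule, the `max_km` cutoff, the function names, and the approximate city-centre coordinates are all assumptions.

```python
from math import radians, sin, cos, asin, sqrt

# Approximate city-centre coordinates (latitude, longitude).
# Assumed for illustration; not taken from the study.
CITIES = {
    "Cape Town": (-33.9249, 18.4241),
    "Durban": (-29.8587, 31.0218),
    "Johannesburg": (-26.2041, 28.0473),
}

EARTH_RADIUS_KM = 6371.0


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two points."""
    phi1, phi2 = radians(lat1), radians(lat2)
    dphi = phi2 - phi1
    dlmb = radians(lon2 - lon1)
    a = sin(dphi / 2) ** 2 + cos(phi1) * cos(phi2) * sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * asin(sqrt(a))


def assign_city(lat, lon, max_km=50.0):
    """Return the nearest of the three cities, or None if the tweet
    lies farther than max_km (a hypothetical cutoff) from all of them."""
    name, dist = min(
        ((city, haversine_km(lat, lon, clat, clon))
         for city, (clat, clon) in CITIES.items()),
        key=lambda pair: pair[1],
    )
    return name if dist <= max_km else None
```

For example, `assign_city(-33.93, 18.42)` returns `"Cape Town"`, while a tweet geotagged far from all three cities returns `None` and would be excluded from the city-level comparison.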
RESULTS
The number of new COVID–19 cases was significantly and positively correlated with the number of tweets in South Africa (Corr=0.462, P<.001). Of the 10 topics identified from the tweets, 2 concerned the COVID–19 vaccines: uptake and supply, respectively. The intensity of the sentiment scores for these two topics was associated with the total number of vaccines administered in South Africa (P<.001). Discussions of the two topics showed higher intensity scores for the neutral sentiment class (P=.015) than for the other sentiment classes. Additionally, the intensity of the discussions of the two topics was associated with the total number of vaccines administered, new cases, deaths, and recoveries across the three cities (P<.001). The sentiment score for the most discussed topic, vaccine uptake, differed across the three cities for the positive (P=.003), negative (P=.002), and neutral (P<.001) sentiment classes.
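A correlation such as the one reported between new cases and tweet counts can be computed as sketched below. The abstract does not name the correlation measure, so the Pearson coefficient is an assumption here, and `pearson_r` is a hypothetical helper name. The sketch computes only the coefficient, not the P value.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of cross-products and squared deviations from the means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)
```

In use, `xs` would hold the daily counts of new cases and `ys` the daily counts of tweets over the same dates.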
CONCLUSIONS
This study showed that geolocation clustering of Twitter posts can be used to better analyze sentiments toward COVID–19 vaccines at the local level. This can provide additional city-level information to support health policy and decision-making regarding COVID–19 vaccine hesitancy.