Affiliation:
1. Department of Computer Science and Information Technology, University of the District of Columbia, Washington, DC 20008, USA
2. Department of Mathematics and Statistics, University of the District of Columbia, Washington, DC 20008, USA
3. Department of Computer Science, Morgan State University, Baltimore, MD 21251, USA
4. Department of Mechanical and Biomedical Engineering, University of the District of Columbia, Washington, DC 20008, USA
5. Department of Electrical and Computer Engineering, University of the District of Columbia, Washington, DC 20008, USA
Abstract
Popular social media platforms, such as Twitter, have become an excellent source of information because they disseminate it so quickly. Individuals with different backgrounds convey their opinions through these platforms, which makes them a powerful instrument for collecting enormous datasets. We believe that compiling, organizing, exploring, and analyzing data from social media platforms, such as Twitter, can offer public health organizations and decision makers various perspectives for identifying factors that contribute to vaccine hesitancy. In this study, public tweets were downloaded daily from Twitter using the Twitter API. Before any computation, the tweets were preprocessed and labeled. Vocabulary normalization was based on stemming and lemmatization. The NRCLexicon technique was deployed to convert the tweets into ten classes: positive sentiment, negative sentiment, and eight basic emotions (joy, trust, fear, surprise, anticipation, anger, disgust, and sadness). A t-test was used to check the statistical significance of the relationships among the basic emotions. Our analysis shows that the p-values of the joy–sadness, trust–disgust, fear–anger, surprise–anticipation, and negative–positive relations are close to zero. Finally, neural network architectures, including 1DCNN, LSTM, Multilayer Perceptron (MLP), and BERT, were trained and tested on a multi-class classification of COVID-19 sentiments and emotions (positive, negative, joy, sadness, trust, disgust, fear, anger, surprise, and anticipation). Our experiment attained an accuracy of 88.6% for 1DCNN in 1744 s and 89.93% for LSTM in 27,597 s, while MLP achieved an accuracy of 84.78% in 203 s. The results show that the BERT model performed best, with an accuracy of 96.71% in 8429 s.
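The lexicon-based labeling step described in the abstract can be illustrated with a minimal sketch. It does not reproduce the study's pipeline: `TOY_LEXICON` is a small hypothetical word list standing in for the real NRC lexicon, and the rule of assigning each tweet its most frequent category (ties broken by the order of `CLASSES`) is an assumption made for illustration only.

```python
import re
from collections import Counter

# Hypothetical toy lexicon (NOT the real NRC lexicon): a few illustrative
# word -> category entries covering the ten classes named in the abstract.
TOY_LEXICON = {
    "safe": ["positive", "trust"],
    "hope": ["positive", "anticipation", "joy"],
    "afraid": ["negative", "fear"],
    "risk": ["negative", "fear"],
    "angry": ["negative", "anger"],
    "sad": ["negative", "sadness"],
}

# The ten target classes: two sentiments and eight basic emotions.
CLASSES = [
    "positive", "negative", "joy", "trust", "fear",
    "surprise", "anticipation", "anger", "disgust", "sadness",
]


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z']+", text.lower())


def emotion_counts(text):
    """Count how many lexicon hits each category receives in the text."""
    counts = Counter()
    for token in tokenize(text):
        for category in TOY_LEXICON.get(token, []):
            counts[category] += 1
    return counts


def label_tweet(text):
    """Return the most frequent category, or None if no word matches.

    Assumed rule: ties are broken by the order of CLASSES.
    """
    counts = emotion_counts(text)
    if not counts:
        return None
    return max(CLASSES, key=lambda category: counts[category])
```

In this sketch, a tweet with no lexicon hits is left unlabeled rather than forced into a class; a real pipeline would apply stemming or lemmatization before the lookup so that inflected forms match lexicon entries.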
Subject
Health, Toxicology and Mutagenesis; Public Health, Environmental and Occupational Health