Authors:
Renuka Oladri, Radhakrishnan Niranchana
Abstract
The Bidirectional Encoder Representations from Transformers (BERT) model is applied in this work to sentiment analysis of Twitter data. The model was fine-tuned on a Kaggle dataset of manually annotated, anonymized COVID-19-related tweets; each record includes the location, tweet date, original tweet text, and a sentiment label. BERT's performance was compared against a Multinomial Naive Bayes (MNB) baseline and achieved an overall accuracy of 87% on the test set. Per-class results were as follows: for negative sentiment, precision was 0.93, recall was 0.84, and the F1-score was 0.88; for neutral sentiment, precision was 0.86, recall was 0.78, and the F1-score was 0.82; for positive sentiment, precision was 0.82, recall was 0.94, and the F1-score was 0.88. The model handled the linguistic nuances of Twitter, including slang and sarcasm, effectively. The study also identifies limitations of BERT and suggests directions for future research, such as integrating external knowledge and exploring alternative architectures.
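As an illustration of the setup described above, the following is a minimal sketch of fine-tuning BERT for three-class tweet sentiment and reporting per-class precision, recall, and F1-scores. It assumes the Hugging Face transformers library, PyTorch, and scikit-learn; the model name, label encoding, example tweets, and hyperparameters are illustrative assumptions, not details taken from the paper.

```python
# Minimal sketch: fine-tune BERT for 3-class tweet sentiment (assumed setup,
# not the paper's exact pipeline). Requires: torch, transformers, scikit-learn.
import torch
from torch.optim import AdamW
from transformers import BertTokenizerFast, BertForSequenceClassification
from sklearn.metrics import classification_report

LABELS = {"negative": 0, "neutral": 1, "positive": 2}  # assumed encoding

tokenizer = BertTokenizerFast.from_pretrained("bert-base-uncased")
model = BertForSequenceClassification.from_pretrained(
    "bert-base-uncased", num_labels=len(LABELS)
)

# Toy stand-ins for the annotated COVID-19 tweets.
train_texts = ["Stores are out of sanitizer again :(",
               "Grocery delivery slots finally open, great news!"]
train_labels = torch.tensor([LABELS["negative"], LABELS["positive"]])

enc = tokenizer(train_texts, padding=True, truncation=True,
                max_length=128, return_tensors="pt")

# One illustrative training step; a real run would loop over mini-batches
# of the full dataset for several epochs.
optimizer = AdamW(model.parameters(), lr=2e-5)
model.train()
loss = model(**enc, labels=train_labels).loss
loss.backward()
optimizer.step()

# Evaluation: the per-class precision/recall/F1 metrics cited in the abstract.
model.eval()
with torch.no_grad():
    preds = model(**enc).logits.argmax(dim=-1)
print(classification_report(train_labels, preds,
                            labels=list(LABELS.values()),
                            target_names=list(LABELS.keys()),
                            zero_division=0))
```

In practice, the metrics would be computed on a held-out test split rather than the training examples, which is where figures such as the 87% overall accuracy reported above would come from.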
Publisher
Inventive Research Organization