Machine Learning-based Voting Classifier for Improving Sentiment Analysis on Twitter Data

Published:2024-08-12 Volume:5 Page:1-9
ISSN:2960-2238
Container-title:Transactions on Computer Science and Intelligent Systems Research
Short-container-title:TCSISR

Author:

Li Huatao

Abstract

As the number of individuals sharing their thoughts on Twitter continues to grow, understanding the sentiment behind these tweets becomes increasingly important for researchers. To identify the model that most accurately classifies tweet sentiment, the author uses a dataset published in 2022 that contains tweet texts annotated with their corresponding sentiments. Six basic machine learning classification methods are trained: Logistic Regression, Naïve Bayes Classifier, Support Vector Classifier, Decision Tree Classifier, Random Forest Classifier, and K-Nearest Neighbors Classifier. The author then evaluates the trained models. Validation shows that Logistic Regression, the Support Vector Classifier, and the Random Forest Classifier achieve the highest accuracy and F1-score, with only small differences among the three. To improve on them, the author combines these three best models into a voting classifier. The resulting model outperforms every basic model, with accuracy and F1-score both reaching 71.6%. The research also shows how each basic model and the best model differ when distinguishing between positive, neutral, and negative tweets.
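
The voting step described in the abstract can be illustrated with a minimal sketch. The abstract does not say whether hard (majority) or soft (probability-averaged) voting is used, so hard voting is assumed here. The function names `hard_vote` and `accuracy`, the tie-breaking rule, and the toy predictions are all hypothetical and are not taken from the paper.

```python
from collections import Counter


def hard_vote(predictions_per_model):
    """Combine per-model label lists by majority vote, one sample at a time."""
    combined = []
    for votes in zip(*predictions_per_model):
        counts = Counter(votes)
        top = max(counts.values())
        # Tie-break (an assumption): among tied labels, keep the one
        # predicted by the earliest-listed model.
        winner = next(label for label in votes if counts[label] == top)
        combined.append(winner)
    return combined


def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label matches the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


# Made-up labels and predictions for four tweets (illustration only).
y_true = ["positive", "neutral", "negative", "neutral"]
lr_pred = ["positive", "neutral", "neutral", "neutral"]    # Logistic Regression
svc_pred = ["positive", "negative", "negative", "neutral"]  # Support Vector Classifier
rf_pred = ["neutral", "neutral", "negative", "positive"]    # Random Forest

voted = hard_vote([lr_pred, svc_pred, rf_pred])
print(voted, accuracy(y_true, voted))
```

In this toy data each model makes at least one mistake, but no two models err on the same tweet, so the majority label is correct every time. That is the intuition behind combining models of similar strength. In practice a library implementation such as scikit-learn's `VotingClassifier` would normally be used instead of hand-written voting.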

Publisher

Warwick Evans Publishing

References: 12 articles.

1. Boon-Itt, Sakun, and Yukolpat Skunkan. Public perception of the COVID-19 pandemic on Twitter: sentiment analysis and topic modeling study. JMIR Public Health and Surveillance, 2020, 6(4): e21978.

2. La Gatta, Valerio, et al. COVID-19 Sentiment Analysis Based on Tweets. IEEE Intelligent Systems, 2023, 38(3): 51-55.

3. Qi, Yuxing, and Zahratu Shabrina. Sentiment analysis using Twitter data: a comparative application of lexicon- and machine-learning-based approach. Social Network Analysis and Mining, 2023, 13(1): 31.

4. Mostafa, Lamiaa. Egyptian student sentiment analysis using Word2vec during the coronavirus (Covid-19) pandemic. Proceedings of the International Conference on Advanced Intelligent Systems and Informatics 2020, 2021.

5. Al Amrani, Yassine, Mohamed Lazaar, and Kamal Eddine El Kadiri. Random forest and support vector machine based hybrid approach to sentiment analysis. Procedia Computer Science, 2018, 127: 511-520.