Author:
Ali Kawo Mohammed, Muhammad Garba, Gabi Danlami, Sule Argungu Musa
Abstract
Sentiment analysis makes it much easier to extract subjective information from online user-generated text. When individual algorithms are applied to a review dataset for a classification task, most classifiers produce accurate results while others produce limited and inaccurate predictions, so which machine learning model performs best in sentiment analysis remains a persistent question. This research evaluates machine learning algorithms for classifying an online dataset by testing four of them on the same data: Naive Bayes, Support Vector Machine, K-Nearest Neighbor and Decision Tree. Our primary goal is to identify the most effective of these classifiers for sentiment analysis of English texts. Their robustness will be tested on an imbalanced dataset from Kaggle.com, a machine learning repository. The dataset will first undergo preprocessing to enable analysis, followed by feature extraction for the base classifiers; the work will be carried out in Jupyter Notebook from Anaconda. Each algorithm's performance will be scored using the confusion matrix, precision, recall and F1-score.
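The evaluation metrics named in the abstract can be illustrated with a minimal sketch. It is not the authors' code: the function names, the labels "pos" and "neg", and the toy predictions are all hypothetical, and only the Python standard library is used. It shows why accuracy alone can mislead on an imbalanced dataset, which is the reason precision, recall and F1-score are reported alongside the confusion matrix.

```python
def confusion_counts(y_true, y_pred, positive):
    """Return (tp, fp, fn, tn), treating `positive` as the positive class."""
    tp = fp = fn = tn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive and truth == positive:
            tp += 1
        elif pred == positive:
            fp += 1
        elif truth == positive:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


def scores(y_true, y_pred, positive):
    """Compute accuracy, precision, recall and F1 for one class.

    Ratios with a zero denominator are reported as 0.0 (a convention
    chosen for this sketch, not something stated in the abstract).
    """
    tp, fp, fn, tn = confusion_counts(y_true, y_pred, positive)
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Hypothetical imbalanced toy data: 8 positive reviews, 2 negative ones.
y_true = ["pos"] * 8 + ["neg"] * 2

# A classifier that always predicts the majority class.
always_pos = ["pos"] * 10
majority_result = scores(y_true, always_pos, positive="neg")

# A classifier that finds one of the two negative reviews and
# mislabels one positive review as negative.
mixed = ["pos"] * 7 + ["neg"] + ["neg", "pos"]
mixed_result = scores(y_true, mixed, positive="neg")

print(majority_result)
print(mixed_result)
```

Both toy classifiers reach the same accuracy, yet the first never detects the minority class, which only precision, recall and F1 for that class reveal.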
Publisher
International Journal of Innovative Science and Research Technology