Abstract
The utilization of sentiment analysis has gained significant importance as a valuable method for obtaining meaningful insights from textual data. The research progress in languages such as English and Chinese has been notable. However, there is a noticeable dearth of attention towards creating tools for sentiment analysis in the Bangla language. Currently, datasets are limited for Bangla sentiment analysis, especially balanced datasets capturing both binary and multiclass sentiment for e-commerce applications. This paper introduces a new sentiment analysis dataset from the popular Bangladeshi e-commerce site “Daraz”. The dataset contains 1000 reviews across 5 product categories, with both binary (positive/negative) and multiclass (very positive, positive, negative, very negative) sentiment labels manually annotated by native Bangla speakers. Reviews were collected using an organized process, and labels were assigned based on standardized criteria to ensure accuracy. In addition, a benchmark evaluation of the performance achieved by Machine Learning and Deep Learning algorithms on this dataset is also provided. The new dataset can aid research on multiclass and binary Bangla sentiment analysis utilizing both machine learning, deep learning, and Large Language Models. It can aid e-commerce platforms in analysing nuanced user opinions and emotions from online reviews. The utilization of categorized product reviews also facilitates research in the field of text categorization.
Reference31 articles.
1. D. Khurana, A. Koli, K. Khatter, and S. Singh, "Natural language processing: State of the art, current trends and challenges," Multimedia tools and applications, vol. 82, no. 3, pp. 3713-3744, 2023.
2. W. Medhat, A. Hassan, and H. Korashy, "Sentiment analysis algorithms and applications: A survey," Ain Shams engineering journal, vol. 5, no. 4, pp. 1093-1113, 2014.
3. R. A. Tuhin, B. K. Paul, F. Nawrine, M. Akter, and A. K. Das, "An automated system of sentiment analysis from Bangla text using supervised learning techniques," in 2019 IEEE 4th International Conference on Computer and Communication Systems (ICCCS), 2019: IEEE, pp. 360-364.
4. T. Al Mahmud, S. Sultana, T. I. Chowdhury, and F. R. Anando, "A New Approach to Analysis of Public Sentiment on Padma Bridge in Bangla Text," in 2022 4th International Conference on Sustainable Technologies for Industry 4.0 (STI), 2022: IEEE, pp. 1-6.
5. M. A. Hasan, S. Das, A. Anjum, F. Alam, A. Anjum, A. Sarker, and S. R. H. Noori, "Zero-and Few-Shot Prompting with LLMs: A Comparative Study with Fine-tuned Models for Bangla Sentiment Analysis," arXiv preprint arXiv:2308.10783, 2023.
Cited by
1 articles.
订阅此论文施引文献
订阅此论文施引文献,注册后可以免费订阅5篇论文的施引文献,订阅后可以查看论文全部施引文献