Fine-tuning BERT, DistilBERT, XLM-RoBERTa and Ukr-RoBERTa models for sentiment analysis of Ukrainian language reviews

Published:2024-06-28 Issue:AI.2024.29(2) Volume:29 Page:85-97
ISSN:2710-1673
Container-title:Artificial Intelligence
Short-container-title:Stuc.intelekt

Author:

M Prytula

Abstract

Sentiment analysis is one of the crucial tasks of natural language processing. It involves recognizing emotions expressed in textual data from various fields of activity. Automated sentiment detection helps businesses increase profits by analyzing customer sentiment and responding quickly to customers' level of satisfaction with products or services. Developing tools for high-quality classification of text sentiment is therefore important, given how many reviews users leave on social networks, platforms, and websites. The study examines the fine-tuning of the BERT, DistilBERT, XLM-RoBERTa, and Ukr-RoBERTa models for sentiment analysis of reviews in the Ukrainian language, since transformer models capture context better and are highly effective on natural language processing tasks. The dataset comprised about 11,000 user comments in Ukrainian, covering topics such as shops, restaurants, hotels, medical facilities, fitness clubs, and the provision of various services. The textual data was categorized into two classes: positive and negative. After text preprocessing, the dataset was divided into training and test sets in an 80:20 ratio. Hyperparameters were selected to optimize the performance of the pre-trained models on comment sentiment classification, and model effectiveness was evaluated using accuracy, recall, precision, and F1-score. The results show that DistilBERT requires significantly fewer computing resources and is faster than the other models. The XLM-RoBERTa model achieved the highest accuracy, 91.32%. However, considering both training time and all the classification metrics, Ukr-RoBERTa is the optimal choice.
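The evaluation setup described in the abstract (an 80:20 train/test split and the accuracy, precision, recall, and F1-score metrics for two classes) can be illustrated with a minimal sketch. This is not the author's code: the helper names `split_80_20` and `binary_metrics`, the random seed, and the toy labels are all assumptions made for illustration, and the label encoding (1 for positive, 0 for negative) is likewise assumed.

```python
import random


def split_80_20(samples, seed=42):
    """Shuffle a copy of `samples` and split it 80:20 (hypothetical helper)."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * 0.8)
    return shuffled[:cut], shuffled[cut:]


def binary_metrics(y_true, y_pred, positive=1):
    """Compute accuracy, precision, recall and F1 for two-class labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)

    accuracy = (tp + tn) / len(pairs)
    # Guard against division by zero when a class is never predicted.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Toy data only: ten placeholder reviews with assumed labels
# (1 = positive, 0 = negative).
reviews = [(f"review {i}", i % 2) for i in range(10)]
train, test = split_80_20(reviews)

# Made-up gold labels and predictions, standing in for a model's output.
y_true = [1, 0, 1, 1, 0, 1, 0, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]
metrics = binary_metrics(y_true, y_pred)
```

Reporting precision and recall alongside accuracy matters when the two classes are unevenly represented, which is one reason a model with the highest accuracy is not automatically the best overall choice.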

Publisher

National Academy of Sciences of Ukraine (Co. LTD Ukrinformnauka) (Publications)
