Fine-tuning BERT, DistilBERT, XLM-RoBERTa and Ukr-RoBERTa models for sentiment analysis of Ukrainian-language reviews

Author:

M Prytula

Abstract

Sentiment analysis is a crucial task in natural language processing that involves recognizing emotions expressed in text from many fields of activity. Automated sentiment detection helps businesses increase profits by analyzing customer sentiment and responding quickly to the level of satisfaction with products or services. Because users leave many reviews on social networks, platforms, and websites, tools that classify text sentiment with high quality are important. This study examines the fine-tuning of the BERT, DistilBERT, XLM-RoBERTa, and Ukr-RoBERTa models for sentiment analysis of reviews in Ukrainian, since transformer models capture context better and perform well on natural language processing tasks. The dataset comprised about 11,000 user comments in Ukrainian on topics such as shops, restaurants, hotels, medical facilities, fitness clubs, and various services. The comments were labeled with two classes: positive and negative. After text preprocessing, the dataset was split into training and test sets in an 80:20 ratio. Hyperparameters were selected to optimize the performance of the pre-trained models on comment sentiment classification, and the models were evaluated using accuracy, recall, precision, and F1-score. The results show that DistilBERT requires significantly fewer computing resources and is faster than the other models. The XLM-RoBERTa model achieved the highest accuracy, 91.32%. However, considering training time together with all the classification metrics, Ukr-RoBERTa is the optimal choice.
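
The evaluation setup described in the abstract (an 80:20 train/test split and accuracy, precision, recall, and F1-score for two classes) can be illustrated with a minimal sketch. This is not the study's code: the function names `split_80_20` and `binary_metrics`, the random seed, and the toy labels are hypothetical, and the positive class is assumed to be encoded as 1.

```python
import random


def split_80_20(samples, test_ratio=0.2, seed=42):
    """Shuffle a copy of the samples and split it into (train, test).

    The seed is an illustrative assumption, used only for reproducibility.
    """
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)
    n_test = round(len(shuffled) * test_ratio)
    return shuffled[n_test:], shuffled[:n_test]


def binary_metrics(y_true, y_pred, positive=1):
    """Compute accuracy, precision, recall and F1 for binary labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)

    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# Toy example: ten placeholder review IDs and eight made-up label pairs.
train_ids, test_ids = split_80_20(range(10))
toy_true = [1, 1, 1, 0, 0, 0, 1, 0]
toy_pred = [1, 1, 0, 0, 0, 1, 1, 0]
toy_scores = binary_metrics(toy_true, toy_pred)
print(len(train_ids), len(test_ids), toy_scores)
```

Reporting all four metrics rather than accuracy alone matters when the positive and negative classes are not balanced, which is one reason a model with the highest accuracy need not be the preferred one overall.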

Publisher

National Academy of Sciences of Ukraine (Co. LTD Ukrinformnauka) (Publications)

