Deep learning-based sentiment classification in Amharic using multi-lingual datasets

Published:2023 Issue:4 Volume:20 Page:1459-1481
ISSN:1820-0214
Container-title:Computer Science and Information Systems
language:en
Short-container-title:COMSIS J

Author:

Tesfagergish Senait Gebremichael¹, Damasevicius Robertas¹, Kapociūtė-Dzikienė Jurgita²

Affiliation:

1. Department of Software Engineering, Kaunas University of Technology, Kaunas, Lithuania

2. Department of Applied Informatics, Vytautas Magnus University, Kaunas, Lithuania

Abstract

Sentiment analysis, the analysis of emotions expressed in natural language text, is a key application of natural language processing (NLP). It involves assigning a positive or negative (and sometimes neutral) value to opinions expressed in contexts such as social media, news, and blogs. Despite its importance, sentiment analysis for under-researched languages like Amharic has received little attention in NLP because the resources required to train such methods are scarce. This paper examines deep learning methods such as CNN, LSTM, FFNN, BiLSTM, and transformers, as well as memory-based methods like cosine similarity, for sentiment classification using word or sentence embedding techniques. The research includes training and comparing mono-lingual and cross-lingual models on Amharic social media messages from Twitter. The study concludes that the lack of training data in the target language is not a significant issue, since 1) training data can be machine translated from other languages as a data augmentation technique [33], or 2) cross-lingual models can capture the semantics of the target language even when trained on another language (e.g., English). Finally, the FFNN classifier, which combined the sentence transformer and the cosine similarity method, proved to be the best option for both the 3-class and 2-class sentiment classification tasks, achieving 62.0% and 82.2% accuracy, respectively.
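
The memory-based cosine similarity idea named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the three-dimensional vectors, the `MEMORY` list, and the function names are hypothetical stand-ins for real sentence embeddings and labeled training examples, and the FFNN stage that the paper combines with this method is not shown. The sketch only shows how a new text could take the label of its most similar stored example.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        # A zero vector has no direction; treat it as dissimilar to everything.
        return 0.0
    return dot / (norm_a * norm_b)


def classify_by_similarity(query, memory):
    """Return (label, score) of the stored example most similar to the query."""
    best_label, best_score = None, -2.0  # cosine similarity is never below -1
    for vector, label in memory:
        score = cosine_similarity(query, vector)
        if score > best_score:
            best_label, best_score = label, score
    return best_label, best_score


# Hypothetical toy "embeddings"; a real system would use sentence-embedding
# vectors with hundreds of dimensions produced by a pretrained model.
MEMORY = [
    ([0.9, 0.1, 0.0], "positive"),
    ([0.1, 0.9, 0.0], "negative"),
    ([0.0, 0.1, 0.9], "neutral"),
]

label, score = classify_by_similarity([0.8, 0.2, 0.1], MEMORY)
print(label, round(score, 3))
```

Because the comparison happens in embedding space, the stored examples and the query do not have to be in the same language, provided a cross-lingual embedding model maps both into a shared space.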

Publisher

National Library of Serbia

Subject

General Computer Science

Cited by 3 articles.

1. Sentiment Analysis for Amharic-English Code-Mixed Sociopolitical Posts Using Deep Learning;2024-08-12

2. Multimodal Hinglish Tweet Dataset for Deep Pragmatic Analysis;Data;2024-02-15

3. Sentiment Analysis in Low-Resource Settings: A Comprehensive Review of Approaches, Languages, and Data Sources;IEEE Access;2024