A Semi-supervised Approach for Sentiment Analysis of Arab(ic+izi) Messages: Application to the Algerian Dialect

Author:

Guellil Imane, Adeel Ahsan, Azouaou Faical, Benali Fodil, Hachani Ala-Eddine, Dashtipour Kia, Gogate Mandar, Ieracitano Cosimo, Kashani Reza, Hussain Amir

Abstract

In this paper, we propose a semi-supervised approach for sentiment analysis of Arabic and its dialects. The approach is based on a sentiment corpus that was constructed automatically and then reviewed manually by native speakers of the Algerian dialect. It consists of constructing and applying a set of deep learning algorithms to classify the sentiment of Arabic messages as positive or negative. It was applied to Facebook messages written in Modern Standard Arabic (MSA) as well as in the Algerian dialect (DALG, a low-resource dialect spoken by more than 40 million people), in both Arabic script and Arabizi. To handle Arabizi, we consider two options: transliteration (widely used in the research literature for handling Arabizi) and translation (never used in the research literature for handling Arabizi). To highlight the effectiveness of a semi-supervised approach, we carried out experiments using both corpora for training (i.e., the corpus constructed automatically and the one that was reviewed manually). The experiments were run on several test corpora dedicated to MSA/DALG that were proposed and evaluated in the research literature. Both shallow and deep learning classifiers are used: Random Forest (RF), Logistic Regression (LR), Convolutional Neural Network (CNN), and Long Short-Term Memory (LSTM). These classifiers are combined with word embedding models, such as Word2vec and fastText, for sentiment classification. Experimental results (F1 score up to 95% for intrinsic experiments and up to 89% for extrinsic experiments) showed that the proposed system outperforms existing state-of-the-art methodologies (the best improvement is up to 25%).
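The results above are reported as F1 scores. As a reminder of what that metric measures, here is a minimal sketch of precision, recall, and F1 for a binary positive/negative task. The function name `f1_score`, the label strings, and the toy data are illustrative assumptions, not taken from the paper, and the abstract does not say whether its figures are per-class or averaged over classes.

```python
def f1_score(gold, predicted, positive="positive"):
    """Return (precision, recall, F1) for the class named by `positive`.

    Hypothetical helper for illustration only; it is not the
    evaluation code used by the authors.
    """
    pairs = list(zip(gold, predicted))
    # True positives: gold and prediction both carry the positive label.
    tp = sum(1 for g, p in pairs if g == positive and p == positive)
    # False positives: predicted positive, but gold says otherwise.
    fp = sum(1 for g, p in pairs if g != positive and p == positive)
    # False negatives: gold positive, but predicted otherwise.
    fn = sum(1 for g, p in pairs if g == positive and p != positive)

    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    # F1 is the harmonic mean of precision and recall.
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Toy labels (made up): five messages, gold annotation vs. classifier output.
gold = ["positive", "positive", "negative", "negative", "positive"]
pred = ["positive", "negative", "negative", "positive", "positive"]
precision, recall, f1 = f1_score(gold, pred)
print(f"precision={precision:.3f} recall={recall:.3f} f1={f1:.3f}")
```

Passing `positive="negative"` gives the score for the other class; averaging the two per-class F1 values yields a macro-averaged F1, which is one common way to summarize a two-class sentiment task.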

Publisher

Springer Science and Business Media LLC


Cited by 25 articles.

1. Natural Language Processing for Arabic Sentiment Analysis: A Systematic Literature Review;IEEE Transactions on Big Data;2024-10

2. An analysis of customer perception using lexicon-based sentiment analysis of Arabic Texts framework;Heliyon;2024-06

3. Advancements in Sentiment Analysis for the Algerian Dialect: A Comprehensive Review;2024 6th International Conference on Pattern Analysis and Intelligent Systems (PAIS);2024-04-24

4. Comparative Analysis of Machine Learning Algorithms for Arabic Sentiment Analysis on Imbalanced Social Media Data;2024 ASU International Conference in Emerging Technologies for Sustainability and Intelligent Systems (ICETSIS);2024-01-28

5. Word Embedding as a Semantic Feature Extraction Technique in Arabic Natural Language Processing: An Overview;The International Arab Journal of Information Technology;2024
