CLASENTI

Published:2018-12-31 Issue:4 Volume:17 Page:1-28
ISSN:2375-4699
Container-title:ACM Transactions on Asian and Low-Resource Language Information Processing
language:en
Short-container-title:ACM Trans. Asian Low-Resour. Lang. Inf. Process.

Author:

Hamdi Ali¹, Shaban Khaled², Zainal Anazida¹

Affiliation:

1. Faculty of Computing, UTM, Malaysia

2. Smart Pivoting QSTP LLC, Doha, Qatar

Abstract

Arabic text sentiment analysis suffers from low accuracy due to Arabic-specific challenges (e.g., limited resources, morphological complexity, and dialects) and general linguistic issues (e.g., fuzziness, implicit sentiment, sarcasm, and spam). The limited-resources problem requires efforts to build new and improved Arabic corpora and lexica. We propose a class-specific sentiment analysis (CLASENTI) framework. The framework includes a new annotation approach for building a multi-faceted Arabic corpus and lexicon, allowing simultaneous annotation of different facets, including domains, dialects, linguistic issues, and polarity strengths. Each of these facets has multiple classes (e.g., the nine classes representing dialects found in the Arab world). The new corpus and lexicon annotations facilitate the development of new class-specific classification models and polarity strength calculation. For the new sentiment classification models, we propose a hybrid model combining corpus-based and lexicon-based models. The corpus-based model is built in two interrelated phases: (1) full-corpus classification models for all facets; and (2) class-specific models trained on subsets of the corpus that are filtered according to the performance of the full-corpus models. To calculate polarity strengths, the lexicon-based model filters the annotated lexicon based on the specific classes of the domain and dialect. As a case study, we collect and annotate 15274 reviews from various sources, including surveys, Facebook comments, and Twitter posts, pertaining to governmental services. In addition, we develop a new web-based application to apply the proposed framework to the case study. The CLASENTI framework reaches up to 95% accuracy and 93% F1-score, surpassing the best-known sentiment classifiers implemented in the Scikit-learn library, which achieve 82% accuracy and 81% F1-score for Arabic when tested on the same dataset.
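
The lexicon-based step described in the abstract (filtering the annotated lexicon by the domain and dialect classes before calculating polarity strength) might be sketched roughly as follows. This is only an illustration, not the authors' implementation: the `LexiconEntry` fields, the facet labels, the toy English entries, and the choice of averaging matched strengths are all assumptions, since the abstract does not specify the scoring formula.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LexiconEntry:
    """One annotated lexicon term (hypothetical schema for this sketch)."""
    term: str
    domain: str      # assumed facet label, e.g. "health"
    dialect: str     # assumed facet label, e.g. "gulf"
    strength: float  # assumed signed polarity strength in [-1.0, 1.0]


def filter_lexicon(lexicon, domain, dialect):
    """Keep only the entries annotated with the given domain and dialect classes."""
    return {
        entry.term: entry.strength
        for entry in lexicon
        if entry.domain == domain and entry.dialect == dialect
    }


def polarity_strength(tokens, lexicon, domain, dialect):
    """Average the strengths of tokens found in the class-filtered lexicon.

    Averaging is an assumption of this sketch; 0.0 is returned when no token matches.
    """
    scores = filter_lexicon(lexicon, domain, dialect)
    hits = [scores[token] for token in tokens if token in scores]
    return sum(hits) / len(hits) if hits else 0.0


# Toy lexicon with placeholder terms; a real lexicon would hold Arabic entries.
LEXICON = [
    LexiconEntry("good", "health", "gulf", 0.5),
    LexiconEntry("slow", "health", "gulf", -0.5),
    LexiconEntry("slow", "transport", "gulf", -1.0),
    LexiconEntry("good", "health", "levantine", 0.25),
]

if __name__ == "__main__":
    review = ["service", "was", "slow"]
    print(polarity_strength(review, LEXICON, "health", "gulf"))
    print(polarity_strength(review, LEXICON, "transport", "gulf"))
```

In this sketch the same term can carry a different strength in each domain or dialect, so the class labels predicted for a review decide which strengths are applied to it.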

Publisher

Association for Computing Machinery (ACM)

Subject

General Computer Science

Link

https://dl.acm.org/doi/pdf/10.1145/3209885

References: 89 articles.

1. Effect of training set size on SVM and Naïve Bayes for Twitter sentiment analysis

2. SAMAR: Subjectivity and sentiment analysis for Arabic social media

Cited by 16 articles.

1. Ensemble Stacking Model for Sentiment Analysis of Emirati and Arabic Dialects;Journal of King Saud University - Computer and Information Sciences;2023-09

2. Sentiment Analysis of Emirati Dialect;Big Data and Cognitive Computing;2022-05-17

3. Empirical Evaluation of Shallow and Deep Learning Classifiers for Arabic Sentiment Analysis;ACM Transactions on Asian and Low-Resource Language Information Processing;2022-01-31

4. C-SAR: Class-Specific and Adaptive Recognition for Arabic Handwritten Cheques;Advances on Intelligent Informatics and Computing;2022

5. A Systematic Review for Sentiment Analysis of Arabic Dialect Texts Researches;Proceedings of International Conference on Emerging Technologies and Intelligent Systems;2021-12-03