Author:
Abdullah Nur Atiqah Sia, Rusli Nur Ida Aniza
Abstract
With the explosive growth of social media, members of online communities can freely express their opinions without disclosing their identities. People with hidden agendas can easily post fake opinions to discredit target products, services, politicians, or organizations. With such big data, monitoring opinions and distilling their sentiment remains a formidable task because of the proliferation of diverse sites carrying a large volume of opinions expressed in multiple languages. Therefore, this paper provides a systematic literature review of multilingual sentiment analysis, summarising the languages commonly supported, the pre-processing techniques, the existing sentiment analysis approaches, and the evaluation models that have been used. The review found that most models supported two languages and that English is the most used language in sentiment analysis studies. None of the reviewed literature addressed the combination of English, Chinese, Malay, and Hindi in multilingual sentiment analysis. The common pre-processing techniques for the multilingual domain are tokenization, normalization, capitalization, N-gram, and machine translation. Meanwhile, the classification technique used for multilingual sentiment is hybrid sentiment analysis, which combines localized language analysis and unsupervised topic clustering, followed by multilingual sentiment analysis. In terms of evaluation, most of the studies used precision, recall, and accuracy as the benchmarks for their results.
Publisher
Universiti Putra Malaysia
Subject
General Earth and Planetary Sciences, General Environmental Science
Cited by
13 articles.