Improving the Performance of Sentiment Analysis by Ensemble Hybrid Learning Algorithm With NLP And Cascaded Feature Extraction

Published:2023-03-30 Issue:1 Volume:35 Page:125-141
ISSN:2636-8277
Container-title:International Journal of Advances in Engineering and Pure Sciences

Author:

ALTINEL GİRGİN Ayşe Berna¹, ŞAHİN Sema¹

Affiliation:

1. MARMARA UNIVERSITY, FACULTY OF TECHNOLOGY

Abstract

Sentiment analysis is a challenging problem in Natural Language Processing because every language has its own characteristics and difficulties, such as ambiguity, synonymy, and negative suffixes. Since ambiguous words can have different sentiment scores depending on their meaning in context, we conducted a study on Turkish to determine whether the polarity scores of polysemous words differ according to their meaning. We first built a polarity calculation module that computes the polarity score of an ambiguous word, and used it to calculate the polarity scores of 100 Turkish polysemous words. Because negation directly affects the meaning of a word in sentiment analysis, we also implemented a negation handling module. We then prepared a sentiment polarity corpus consisting of 159,876 Turkish words, including the 100 polysemous words. The main purpose of this study is to detect the sentiment polarity of Turkish texts by building a specialized module for polysemous words. In short, we built a system for the Turkish sentiment polarity detection task comprising these modules: 1) Pre-processing, 2) Polarity Calculation Module, 3) Negation Handling Module, 4) Feature Generation Module, and 5) Classification Module. To our knowledge, this is the first study to include all of these modules in a single Turkish sentiment analysis task. Finally, we evaluated this corpus using an ensemble hybrid regularized learning algorithm on two self-collected Twitter datasets. Experimental results show that the proposed approach improves classification performance on the Turkish sentiment analysis task.
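The interplay between sense-dependent polarity and negation handling described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the lexicon entries, their scores, the sense labels, and the function names `token_polarity` and `sentence_polarity` are all hypothetical, and negation is reduced to the particle "değil" flipping the sign of the preceding scored token. Pre-processing, feature generation, and classification are omitted.

```python
from typing import Dict, List, Tuple

# Hypothetical toy lexicon: (word, sense label) -> polarity in [-1, 1].
# Scores are invented for illustration and do not come from the paper.
SENSE_POLARITY: Dict[Tuple[str, str], float] = {
    ("kara", "land"): 0.0,    # "kara" meaning land: assumed neutral
    ("kara", "dark"): -0.6,   # "kara" meaning dark/black: assumed negative
    ("kötü", "bad"): -0.8,
    ("güzel", "beautiful"): 0.8,
}

# Simplifying assumption: only the particle "değil" ("not") is handled.
# Suffix-based negation would need morphological analysis.
NEGATION_WORDS = {"değil"}


def token_polarity(word: str, sense: str) -> float:
    """Look up the polarity of a word in a given sense; unknown pairs score 0."""
    return SENSE_POLARITY.get((word, sense), 0.0)


def sentence_polarity(tagged_tokens: List[Tuple[str, str]]) -> float:
    """Sum sense-dependent polarities, flipping the sign of the token
    that immediately precedes a negation word."""
    scores: List[float] = []
    for word, sense in tagged_tokens:
        if word in NEGATION_WORDS:
            if scores:
                scores[-1] = -scores[-1]  # negate the preceding score
            continue
        scores.append(token_polarity(word, sense))
    return sum(scores)


if __name__ == "__main__":
    # Same surface form, different senses, different scores.
    print(sentence_polarity([("kara", "land")]))
    print(sentence_polarity([("kara", "dark")]))
    # "kötü değil" ("not bad"): negation flips the negative score.
    print(sentence_polarity([("kötü", "bad"), ("değil", "")]))
```

The sketch assumes the sense of each token is already known; in practice a word sense disambiguation step would have to supply the sense labels.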

Funder

TÜBİTAK

Publisher

Marmara University

Subject

General Medicine

References (35 articles)

1. Navigli, R. Word sense disambiguation: A survey. ACM Comput Surv, 41(2), 1-69, (2009).

2. Boyd-Graber, J., Blei, D., & Zhu, X. A topic model for word sense disambiguation. In Proceedings of the 2007 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning (EMNLP-CoNLL), Prague, (2007).

3. Açıkgöz, O., Gürkan, A. T., Ertopçu, B., Topsakal, O., Özenç, B., Kanburoğlu, A. B., & Yıldız, O. T. All-words word sense disambiguation for Turkish. In International Conference on Computer Science and Engineering (UBMK), Antalya, Turkey, (2017).

4. Orhan, Z., & Altan, Z. Effective features for disambiguation of Turkish verbs. Int J. Comp and Inf Eng, 1(7), 2264-2268, (2007).

5. Gezici, G., & Yanıkoğlu, B. Sentiment analysis in Turkish. Turkish natural language processing, 255-271, (2018).

Cited by 1 article.

1. So-haTRed: A Novel Hybrid System for Turkish Hate Speech Detection in Social Media With Ensemble Deep Learning Improved by BERT and Clustered-Graph Networks; IEEE Access; 2024