Deep Persian sentiment analysis: Cross-lingual training for low-resource languages

Published:2020-12-02 Page:016555152096278
ISSN:0165-5515
Container-title:Journal of Information Science
language:en
Short-container-title:Journal of Information Science

Author:

Ghasemi Rouzbeh¹, Ashrafi Asli Seyed Arad¹, Momtazi Saeedeh¹

Affiliation:

1. Computer Engineering Department, Amirkabir University of Technology, Iran

Abstract

With the advent of deep neural models in natural language processing tasks, having a large amount of training data plays an essential role in achieving accurate models. Creating valid training data, however, is a challenging issue in many low-resource languages. This problem results in a significant difference between the accuracy of available natural language processing tools for low-resource languages compared with resource-rich languages. To address this problem in the sentiment analysis task in the Persian language, we propose a cross-lingual deep learning framework to benefit from available training data of English. We deployed cross-lingual embedding to model sentiment analysis as a transfer learning model which transfers a model from a rich-resource language to low-resource ones. Our model can use any cross-lingual word embedding model and any deep architecture for text classification. Our experiments on the English Amazon dataset and the Persian Digikala dataset using two different embedding models and four different classification networks show the superiority of the proposed model compared with the state-of-the-art monolingual techniques. Based on our experiments, the performance of Persian sentiment analysis improves by 22% in static embedding and by 9% in dynamic embedding. Our proposed model is general and language-independent; that is, it can be used for any low-resource language, once a cross-lingual embedding is available for the source–target language pair. Moreover, by benefitting from word-aligned cross-lingual embedding, the only required data for a reliable cross-lingual embedding is a bilingual dictionary that is available between almost all languages and the English language, as a potential source language.
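
The transfer idea in the abstract (train on English, apply to Persian through a shared embedding space) can be illustrated with a minimal sketch. This is not the authors' model: the 2-D `EMBEDDING` table is hand-made and hypothetical, and a plain logistic regression over averaged word vectors stands in for the deep classification networks used in the paper. The names `featurize`, `train_logistic`, and `predict` are illustrative assumptions.

```python
import math

# Hypothetical toy cross-lingual embedding: English and Persian words are
# placed in one shared 2-D space so that translations lie close together.
# Real cross-lingual embeddings have hundreds of dimensions.
EMBEDDING = {
    "good": (1.0, 0.1), "خوب": (0.9, 0.2),
    "great": (0.8, 0.3), "عالی": (0.85, 0.25),
    "bad": (-1.0, 0.1), "بد": (-0.9, 0.0),
    "terrible": (-0.8, -0.2), "افتضاح": (-0.85, -0.1),
    "product": (0.0, 1.0), "محصول": (0.05, 0.95),
}


def featurize(sentence):
    """Average the shared-space vectors of the known words in a sentence."""
    vectors = [EMBEDDING[w] for w in sentence.split() if w in EMBEDDING]
    if not vectors:
        return (0.0, 0.0)
    return tuple(sum(dim) / len(vectors) for dim in zip(*vectors))


def train_logistic(samples, labels, epochs=500, lr=0.5):
    """Fit a logistic regression with plain stochastic gradient descent."""
    weights, bias = [0.0, 0.0], 0.0
    for _ in range(epochs):
        for x, y in zip(samples, labels):
            z = sum(w * xi for w, xi in zip(weights, x)) + bias
            error = 1.0 / (1.0 + math.exp(-z)) - y
            weights = [w - lr * error * xi for w, xi in zip(weights, x)]
            bias -= lr * error
    return weights, bias


def predict(model, sentence):
    """Return 1 (positive) or 0 (negative) for a sentence in either language."""
    weights, bias = model
    x = featurize(sentence)
    z = sum(w * xi for w, xi in zip(weights, x)) + bias
    return 1 if z > 0 else 0


# Train on English only (the resource-rich source language).
english_train = [
    ("good product", 1), ("great product", 1),
    ("bad product", 0), ("terrible product", 0),
]
model = train_logistic(
    [featurize(text) for text, _ in english_train],
    [label for _, label in english_train],
)

# Apply the same model, unchanged, to Persian (the low-resource target).
for persian_review in ["محصول خوب", "محصول بد"]:
    print(persian_review, predict(model, persian_review))
```

Because the classifier only ever sees vectors from the shared space, no Persian labels are needed at training time; the quality of the transfer then depends on how well the cross-lingual embedding aligns the two languages.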

Publisher

SAGE Publications

Subject

Library and Information Sciences, Information Systems

Link

http://journals.sagepub.com/doi/pdf/10.1177/0165551520962781

Reference 40 articles.

1. Sentiment strength detection in short informal text

2. Sentiment strength detection for the social web

3. Three-way enhanced convolutional neural networks for sentence-level sentiment classification

Cited by 32 articles.

1. Creating NFT-backed emoji art from user conversations on blockchain;Data Science and Management;2024-06

2. Unveiling Sentiment Insights in Smart Cities: Exploring the Role of Social Media;2024 8th International Conference on Smart Cities, Internet of Things and Applications (SCIoT);2024-05-14

3. LDPSA: A Large Dataset of Persian Sentiment Analysis;2024 10th International Conference on Web Research (ICWR);2024-04-24

4. Exploring Public Response to ChatGPT With Sentiment Analysis and Knowledge Mapping;IEEE Access;2024

5. Sentiment Analysis Based on Improved Transformer Model and Conditional Random Fields;IEEE Access;2024