Author:
Xu Yuemei, Cao Han, Du Wanze, Wang Wenqing
Abstract
Cross-lingual sentiment analysis (CLSA) leverages one or several source languages to help low-resource languages perform sentiment analysis, thereby alleviating the lack of annotated corpora in many non-English languages. With the development of economic globalization, CLSA has attracted much attention in the field of sentiment analysis, and the last decade has seen a surge of research in this area. Numerous methods, datasets and evaluation metrics have been proposed in the literature, raising the need for a comprehensive and updated survey. This paper fills the gap by reviewing the state-of-the-art CLSA approaches from 2004 to the present. It traces the research context of cross-lingual sentiment analysis and elaborates on the following methods in detail: (1) the early main methods of CLSA, including those based on machine translation and its improved variants, parallel corpora, or bilingual sentiment lexicons; (2) CLSA based on cross-lingual word embeddings; (3) CLSA based on Multi-BERT and other pre-trained models. We further analyze their main ideas, methodologies and shortcomings, and attempt to draw conclusions on their coverage of languages, datasets and performance. Finally, we look at the future development of CLSA and the challenges facing the research area.
Funder
Fundamental Research Funds for Central Universities of the Central South University
Publisher
Springer Science and Business Media LLC
Subject
Computer Science Applications, Computational Mechanics
Cited by
24 articles.