Multilingual emoji prediction using BERT for sentiment analysis

Published:2020-09-21 Issue:3 Volume:16 Page:265-280
ISSN:1744-0084
Container-title:International Journal of Web Information Systems
language:en
Short-container-title:IJWIS

Author:

Tomihira Toshiki, Otsuka Atsushi, Yamashita Akihiro, Satoh Tetsuji

Abstract

Purpose: With the standardization of Unicode and the spread of social networking services, the use of emojis has become common. Emojis are particularly effective at expressing emotions in sentences. Sentiment analysis in natural language processing normally requires sentences to be labeled with emotions manually. By treating the emojis in text posted on social media as labels, the authors can predict sentiment without manual labeling. The purpose of this paper is to propose a new model that learns from sentences using emojis as labels, with English and Japanese tweets collected from Twitter as the corpus. The authors verify and compare multiple models based on attention long short-term memory (LSTM), convolutional neural networks (CNN) and Bidirectional Encoder Representations from Transformers (BERT).

Design/methodology/approach: Using the Twitter application programming interface (API), the authors collected tweets containing 2,661 kinds of emoji registered as Unicode characters, for a total of 6,149,410 tweets in Japanese. First, the authors visualized the vector space that Word2Vec produces for the emojis. They found that emojis and words of similar meaning are adjacent in this space, which verifies that emojis can be used for sentiment analysis. Second, tweets containing emojis were given as input, and the models were trained and tested with the emoji as the label. The authors compared the BERT model with the conventional models (CNN, FastText and attention bidirectional long short-term memory (BiLSTM)) that achieved high scores in the previous study.

Findings: By visualizing the Word2Vec vector space, the authors found that emojis and words of similar meaning are adjacent, which verifies that emojis can be used for sentiment analysis. The authors obtained higher scores with the BERT models than with the conventional models, and the experiments demonstrate that this improvement holds in both languages. Emoji prediction in general is greatly influenced by context, and the score may be lowered when the meaning is misunderstood. Because BERT is based on a bidirectional transformer, it can take the context into account.

Practical implications: When a word is typed with an input method editor (IME), emojis can appear among the output candidates. Current IMEs consider only the most recently entered word, whereas the approach in this study makes it possible to recommend emojis based on the context of the entered sentence. The research can therefore be used to improve IME performance in the future.

Originality/value: In the paper, the authors focus on multilingual emoji prediction. This is the first attempt to compare emoji prediction between Japanese and English. It is also the first attempt to use the transformer-based BERT model for predicting limited emojis, although the transformer is known to be effective for various NLP tasks. The authors found that a bidirectional transformer is suitable for emoji prediction.
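
The emoji-as-label idea described in the abstract can be illustrated with a minimal Python sketch. The abstract does not describe the authors' actual preprocessing, so everything here is an assumption made for illustration: the label set `TARGET_EMOJIS`, the helper names `make_example` and `build_dataset`, the rule of keeping only tweets with exactly one kind of target emoji, and the choice to strip that emoji from the input text.

```python
# Hypothetical label set for illustration only; the abstract does not
# list the emojis the authors actually predicted.
TARGET_EMOJIS = ["😂", "😭", "😍", "😊", "🎉"]


def make_example(tweet, targets=TARGET_EMOJIS):
    """Turn one tweet into a (text, label_index) pair.

    Assumption: a tweet is usable only if it contains exactly one kind
    of target emoji. Returns None for any other tweet.
    """
    present = [e for e in targets if e in tweet]
    if len(present) != 1:
        return None
    label = present[0]
    # Remove the label emoji so a classifier cannot simply read it off the input.
    text = tweet.replace(label, "")
    # Collapse the whitespace left behind by the removal.
    text = " ".join(text.split())
    return text, targets.index(label)


def build_dataset(tweets, targets=TARGET_EMOJIS):
    """Keep only the tweets that yield a usable (text, label_index) pair."""
    examples = (make_example(t, targets) for t in tweets)
    return [ex for ex in examples if ex is not None]


if __name__ == "__main__":
    sample = [
        "I passed the exam 🎉🎉",
        "no emoji here",
        "今日は楽しかった 😊",
        "mixed 😂 😭",
    ]
    for text, label in build_dataset(sample):
        print(label, text)
```

The resulting pairs could then be fed to any text classifier, such as a fine-tuned BERT model; that step is omitted here because the abstract gives no training details.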

Publisher

Emerald

Subject

Computer Networks and Communications, Information Systems

Reference: 35 articles.

1. The dabblers at SemEval-2018 task 2: multilingual emoji prediction, 2018

2. Are emojis predictable?, 2017

3. Multi-task emoji learning, 2018

4. Multimodal emoji prediction, 2018

5. Interpretable emoji prediction via label-wise attention LSTMs, 2018

Cited by 28 articles.

1. XAI in geographic analysis of innovation: Evaluating proximity factors in the innovation networks of Chinese technology companies through web-based data;Applied Geography;2024-10

2. B-TTDb: A Database of Turkish Tweets for Predicting the Top One Hundred Emojis;ACM Transactions on the Web;2024-07-24

3. Sentiment analysis of online reviews of energy-saving products based on transfer learning and LBBA model;Journal of Environmental Management;2024-06

4. Comparative analysis of Deep Learning and Machine Learning algorithms for emoji prediction from Arabic text;Social Network Analysis and Mining;2024-03-25

5. Multimodal Sentiments: Unraveling Text and Emoji Dynamics Through Deep Learning;2024 11th International Conference on Reliability, Infocom Technologies and Optimization (Trends and Future Directions) (ICRITO);2024-03-14