Abstract
Microblogging websites such as Twitter have caused sentiment analysis research to increase in popularity over the last several decades. However, most studies focus on the English language, which leaves other languages underrepresented. Therefore, in this paper, we compare several modeling techniques for sentiment analysis using a new dataset containing Flemish tweets. The key contribution of our paper lies in its innovative experimental design: we compared different preprocessing techniques and vector representations to find the best-performing combination for a Flemish dataset. We compared models belonging to four different categories: lexicon-based methods, traditional machine-learning models, neural networks, and attention-based models. We found that more preprocessing leads to better results, but the best-performing vector representation approach depends on the model applied. Moreover, an immense gap was observed between the performances of the lexicon-based approaches and those of the other models. The traditional machine learning approaches and the neural networks produced similar results, but the attention-based model was the best-performing technique. Nevertheless, a tradeoff should be made between computational expenses and performance gains.
Cited by
2 articles.
订阅此论文施引文献
订阅此论文施引文献,注册后可以免费订阅5篇论文的施引文献,订阅后可以查看论文全部施引文献