Affiliation:
1. LIPADE, Université de Paris & IMBA Consulting
2. LIPADE, Université de Paris
3. Mohammed VI Polytechnic University
4. LIPADE, Université de Paris and French University Institute IUF
5. IMBA Consulting
Abstract
In this paper, we present a comprehensive study that evaluates six state-of-the-art sentiment analysis tools on five public datasets, based on the quality of their predictions in the presence of semantically equivalent documents, i.e., how consistently existing tools predict the polarity of paraphrased text. We observe that sentiment analysis tools exhibit intra-tool inconsistency, which is the prediction of different polarities for semantically equivalent documents by the same tool, and inter-tool inconsistency, which is the prediction of different polarities for semantically equivalent documents across different tools. We introduce a heuristic to assess the data quality of an augmented dataset and a new set of metrics to evaluate tool inconsistencies. Our results indicate that tool inconsistency is still an open problem, and they point towards promising research directions and accuracy improvements that can be obtained if such inconsistencies are resolved.
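The two notions of inconsistency defined above can be illustrated with a minimal sketch. It is not the paper's metric set, which the abstract does not spell out. It assumes a hypothetical data layout in which each cluster of paraphrases maps a tool name to the list of polarity labels that tool predicted, one label per document, and it simplifies inter-tool inconsistency to disagreement between tools on the same document. The function names and tool names are illustrative only.

```python
from typing import Dict, List

# Hypothetical layout: one cluster of semantically equivalent documents,
# represented as {tool_name: [polarity predicted for each document]}.
Cluster = Dict[str, List[str]]


def intra_tool_inconsistent(predictions: List[str]) -> bool:
    """True if a single tool assigns more than one polarity within a cluster."""
    return len(set(predictions)) > 1


def inter_tool_inconsistent(cluster: Cluster) -> bool:
    """True if at least two tools disagree on some document of the cluster.

    Simplifying assumption: tools are compared document by document.
    """
    per_document = zip(*cluster.values())  # one tuple of labels per document
    return any(len(set(labels)) > 1 for labels in per_document)


def inconsistency_rate(flags: List[bool]) -> float:
    """Fraction of clusters flagged as inconsistent (0.0 for an empty list)."""
    return sum(flags) / len(flags) if flags else 0.0


# Toy data, invented for illustration only.
clusters: List[Cluster] = [
    {"tool_a": ["pos", "pos", "neg"], "tool_b": ["pos", "pos", "pos"]},
    {"tool_a": ["neg", "neg"], "tool_b": ["neg", "neg"]},
]

intra_a = inconsistency_rate(
    [intra_tool_inconsistent(c["tool_a"]) for c in clusters]
)
inter = inconsistency_rate([inter_tool_inconsistent(c) for c in clusters])
```

On this toy data, `tool_a` changes its prediction inside the first cluster, so that cluster counts as both intra-tool inconsistent for `tool_a` and inter-tool inconsistent, while the second cluster is consistent on both counts.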
Subject
General Earth and Planetary Sciences; Water Science and Technology; Geography, Planning and Development
Cited by
7 articles.