Affiliation:
1. Cultural Technology Department, University of the Aegean, University Hill, Mytilene 81100, Greece
2. Palo Services, 9, Chavriou Street, Athens 10562, Greece
Abstract
Sentiment analysis aims to predict the affective state of a document by applying machine learning techniques to its content and metadata. Recent advances in the field treat sentiment as a multi-dimensional quantity that pertains to different interpretations (or aspects), rather than a single one. Building on earlier research, the current work examines this task within a larger architecture that crawls documents from various online sources. The collected data are then pre-processed to extract features that assist the machine learning algorithms in the sentiment analysis task. More specifically, the words of each text are mapped to a neural embedding space and fed to a hybrid, bi-directional long short-term memory network, coupled with convolutional layers and an attention mechanism, which outputs the final textual features. Additionally, several kinds of document metadata are extracted, including the number of times a document is repeated in the collected corpus (i.e. the number of reposts/retweets), the frequency and type of emoji ideograms, and the presence of keywords, either extracted automatically or assigned manually in the form of hashtags. The novelty of the proposed approach lies in the semantic annotation of the retrieved keywords: an ontology-based knowledge management system is queried to retrieve the classes to which these keywords belong. Finally, all features are provided to a fully connected, multi-layered, feed-forward artificial neural network that performs the analysis task. The overall architecture is compared, on a manually collected corpus of documents, with two other state-of-the-art approaches, achieving the best results in identifying negative sentiment, which is of particular interest to parties (for example, companies) that want to measure their online reputation.
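The metadata features named in the abstract (repetitions in the corpus, emoji frequency and type, hashtag keywords) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function name `metadata_features`, the exact-match definition of a repetition, the approximate emoji code-point ranges, and the sample corpus are all assumptions made for illustration.

```python
import re
from collections import Counter

# Assumption: a rough approximation of emoji code points, not a full Unicode list.
EMOJI_RE = re.compile("[\U0001F300-\U0001FAFF\u2600-\u27BF]")
# Assumption: a hashtag is '#' followed by one or more word characters.
HASHTAG_RE = re.compile(r"#(\w+)")


def metadata_features(text, corpus):
    """Return simple metadata features for one document.

    `corpus` is the list of all collected texts, including `text` itself.
    """
    emojis = EMOJI_RE.findall(text)
    return {
        # Assumption: a repetition is an exact textual duplicate in the corpus.
        "repetitions": sum(1 for doc in corpus if doc == text),
        # Frequency of emoji ideograms in the document.
        "emoji_count": len(emojis),
        # Type of emoji: how often each distinct ideogram occurs.
        "emoji_types": dict(Counter(emojis)),
        # Manually assigned keywords in the form of hashtags, lower-cased.
        "hashtags": [tag.lower() for tag in HASHTAG_RE.findall(text)],
    }


# Hypothetical sample corpus; the first two documents are a post and its repost.
corpus = [
    "Great service today \U0001F600\U0001F600 #Happy",
    "Great service today \U0001F600\U0001F600 #Happy",
    "Delayed again \u2639 #fail #Delay",
]
features = metadata_features(corpus[0], corpus)
```

In a pipeline like the one described, such values would be turned into a numeric vector and combined with the textual features before the final feed-forward network; the ontology lookup for keyword classes is not covered by this sketch.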
Publisher
World Scientific Pub Co Pte Ltd
Subject
Computer Networks and Communications, General Medicine
Cited by
18 articles.