Author:
Pittaras N.,Giannakopoulos G.,Papadakis G.,Karkaletsis V.
Abstract
AbstractThe recent breakthroughs in deep neural architectures across multiple machine learning fields have led to the widespread use of deep neural models. These learners are often applied as black-box models that ignore or insufficiently utilize a wealth of preexisting semantic information. In this study, we focus on the text classification task, investigating methods for augmenting the input to deep neural networks (DNNs) with semantic information. We extract semantics for the words in the preprocessed text from the WordNet semantic graph, in the form of weighted concept terms that form a semantic frequency vector. Concepts are selected via a variety of semantic disambiguation techniques, including a basic, a part-of-speech-based, and a semantic embedding projection method. Additionally, we consider a weight propagation mechanism that exploits semantic relationships in the concept graph and conveys a spreading activation component. We enrich word2vec embeddings with the resulting semantic vector through concatenation or replacement and apply the semantically augmented word embeddings on the classification task via a DNN. Experimental results over established datasets demonstrate that our approach of semantic augmentation in the input space boosts classification performance significantly, with concatenation offering the best performance. We also note additional interesting findings produced by our approach regarding the behavior of term frequency - inverse document frequency normalization on semantic vectors, along with the radical dimensionality reduction potential with negligible performance loss.
Publisher
Cambridge University Press (CUP)
Subject
Artificial Intelligence,Linguistics and Language,Language and Linguistics,Software
Reference67 articles.
1. Knowledge Base Unification via Sense Embeddings and Disambiguation
2. A neural probabilistic language model;Bengio;Journal of Machine Learning Research,2003
3. Distributed Optimization and Statistical Learning via the Alternating Direction Method of Multipliers
4. Estimation of probabilities from sparse data for the language model component of a speech recognizer
5. Ganitkevitch, J. , Van Durme, B. and Callison-Burch, C. (2013). Ppdb: The paraphrase database. In Proceedings of the 2013 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies. The Association for Computational Linguistics, pp. 758–764.
Cited by
17 articles.
订阅此论文施引文献
订阅此论文施引文献,注册后可以免费订阅5篇论文的施引文献,订阅后可以查看论文全部施引文献