Abstract
Text classification is one of the principal tasks of machine learning. It aims to design algorithms that enable computers to extract features and classify texts automatically. Past approaches have been based mainly on keyword classification or on neural-network semantic composition: the former emphasizes the role of key words, while the latter focuses on the combination of all the words in a text. The method proposed in this paper combines the advantages of both. It uses an attention mechanism to learn a weight for each word, so that key words receive higher weights and common words lower ones. The resulting text representation therefore considers all words while paying more attention to key words. The feature vector is then fed to a softmax classifier. Finally, we conduct experiments on two news classification datasets, published by NLPCC2014 and Reuters respectively. The proposed model achieves F-values of 88.5% and 51.8% on the two datasets, and the experimental results show that our method outperforms all the traditional baseline systems.
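The attention-weighted text representation described above can be sketched minimally as follows. This is an illustrative NumPy sketch, not the paper's implementation: it assumes standard dot-product attention against a learned context vector `u`, and all names (`attentive_text_vector`, `classify`, the dimensions) are hypothetical.

```python
import numpy as np

def softmax(x):
    # Numerically stable softmax.
    e = np.exp(x - np.max(x))
    return e / e.sum()

def attentive_text_vector(E, u):
    """Combine word embeddings E (n_words x d) into one text vector.

    Each word gets an attention score from a learned context vector u (d,);
    softmax turns the scores into weights, so key words contribute more
    and common words less, while no word is discarded entirely.
    """
    scores = E @ u            # one score per word
    alpha = softmax(scores)   # attention weights, sum to 1
    return alpha @ E          # weighted sum of the word vectors

def classify(v, W, b):
    # Softmax classifier over the text vector.
    return softmax(W @ v + b)

# Toy example with random parameters (training is omitted in this sketch).
rng = np.random.default_rng(0)
E = rng.normal(size=(5, 8))   # 5 words, 8-dim embeddings
u = rng.normal(size=8)        # attention context vector
W = rng.normal(size=(3, 8))   # 3 output classes
b = np.zeros(3)

v = attentive_text_vector(E, u)
p = classify(v, W, b)
```

In a trained model, `u`, `W`, and `b` would be learned jointly by minimizing the classification loss; here they are random and serve only to show the data flow from word embeddings to class probabilities.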
Publisher
Agora University of Oradea
Subject
Computational Theory and Mathematics, Computer Networks and Communications, Computer Science Applications
Cited by
64 articles.