Affiliation:
1. Information Center, Jiangsu Academy of Agricultural Sciences & Institute of Science and Technology Information, Jiangsu University, China
2. Information Center, Jiangsu Academy of Agricultural Sciences, China
Abstract
Short text classification is a research focus in natural language processing (NLP) and is widely used in news classification, sentiment analysis, mail filtering, and other fields. In recent years, deep learning techniques have been applied to text classification and have made notable progress. Unlike ordinary text, short text suffers from a limited vocabulary and feature sparsity, which place higher demands on semantic feature representation. To address this issue, this paper proposes a feature fusion framework based on Bidirectional Encoder Representations from Transformers (BERT). In this hybrid method, BERT is used to train the word vector representation, a convolutional neural network (CNN) captures static features, and, as a complement, a bidirectional gated recurrent unit (BiGRU) network captures contextual features. Furthermore, an attention mechanism is introduced to assign weights to salient words. The experimental results confirm that the proposed model significantly outperforms other state-of-the-art baseline methods.
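The fusion described in the abstract — BERT word vectors feeding a CNN branch for static local features and a BiGRU-with-attention branch for contextual features, concatenated before classification — can be sketched as below. This is a minimal illustration, not the authors' implementation: a trainable embedding layer stands in for the pretrained BERT encoder so the sketch stays self-contained, and all layer sizes and names (e.g. `BertCNNBiGRUAttention`, `n_filters`, `gru_hidden`) are assumptions.

```python
import torch
import torch.nn as nn

class BertCNNBiGRUAttention(nn.Module):
    """Sketch of the BERT + CNN + BiGRU + attention fusion from the abstract.

    A trainable nn.Embedding stands in for BERT here; in practice the
    (batch, seq_len, hidden) states of a pretrained BERT encoder would
    be fed to the two branches instead.
    """

    def __init__(self, vocab_size=1000, emb_dim=64, n_filters=32,
                 gru_hidden=32, n_classes=4):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim)  # stand-in for BERT
        # CNN branch: static local (n-gram) features
        self.conv = nn.Conv1d(emb_dim, n_filters, kernel_size=3, padding=1)
        # BiGRU branch: contextual features
        self.bigru = nn.GRU(emb_dim, gru_hidden, batch_first=True,
                            bidirectional=True)
        # Additive attention scores over BiGRU states: weight salient words
        self.attn = nn.Linear(2 * gru_hidden, 1)
        # Classifier over the fused (concatenated) feature vector
        self.fc = nn.Linear(n_filters + 2 * gru_hidden, n_classes)

    def forward(self, token_ids):
        x = self.embed(token_ids)                     # (B, T, E)
        # CNN branch with max-over-time pooling
        c = torch.relu(self.conv(x.transpose(1, 2)))  # (B, F, T)
        c = c.max(dim=2).values                       # (B, F)
        # BiGRU branch pooled by attention weights
        h, _ = self.bigru(x)                          # (B, T, 2H)
        w = torch.softmax(self.attn(h), dim=1)        # (B, T, 1)
        g = (w * h).sum(dim=1)                        # (B, 2H)
        # Feature fusion: concatenate static and contextual features
        return self.fc(torch.cat([c, g], dim=1))      # (B, n_classes)
```

For a batch of 2 sequences of 12 token ids, `BertCNNBiGRUAttention()(torch.randint(0, 1000, (2, 12)))` yields logits of shape `(2, 4)`.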
Subject
Strategy and Management, Computer Science Applications, Human-Computer Interaction
References (37 articles)
1. Bijalwan, V. (2014). KNN based machine learning approach for text and document mining. International Journal of Database Theory and Application.
2. Chen, Z. (2019). Short text classification based on word2vec and improved TDFIDF merge weighting. Paper presented at the 2019 3rd International Conference on Electronic Information Technology and Computer Engineering (EITCE).
3. Cho, K., Van Merriënboer, B., Gulcehre, C., Bahdanau, D., Bougares, F., Schwenk, H., & Bengio, Y. (2014). Learning phrase representations using RNN encoder-decoder for statistical machine translation. arXiv preprint arXiv:1406.1078.
4. Devlin, J., Chang, M.-W., Lee, K., & Toutanova, K. (2018). Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805.
5. Fu, Y.-P. (2018). Sentence Classification Using Novel NIN. Journal of Computers.
Cited by 23 articles.