Research on a Capsule Network Text Classification Method with a Self-Attention Mechanism
Author:
Yu Xiaodong 1, Luo Shun-Nain 1, Wu Yujia 1, Cai Zhufei 1, Kuan Ta-Wen 1, Tseng Shih-Pang 1,2
Affiliation:
1. School of Information Science and Technology, Sanda University, Shanghai 201209, China
2. Changzhou College of Information Technology, Changzhou 213164, China
Abstract
Convolutional neural networks (CNNs) must replicate feature detectors to model spatial information, which reduces their efficiency: the number of replicated feature detectors, or the amount of labeled training data such methods require, grows exponentially with the dimensionality of the data. Space-insensitive methods, on the other hand, struggle to encode and represent rich text structures effectively. In response to these problems, this paper proposes a capsule network with a self-attention mechanism (self-attention capsule network, SA-CapsNet) for text classification, in which the capsule network itself, exploiting the symmetry between its two ends, acts as both encoder and decoder. To learn long-distance dependencies between words and encode text more efficiently, SA-CapsNet maps the self-attention module onto the feature extraction layer of the capsule network, strengthening its feature extraction ability and overcoming the limitations of convolutional neural networks. In addition, to improve accuracy, the capsule dimension was reduced and an intermediate layer was added, enabling the model to obtain more expressive instantiation features from a given sentence. Finally, experiments were carried out on three general-purpose datasets of different sizes: IMDB, MPQA, and MR. The model achieved accuracies of 84.72%, 80.31%, and 75.38% on these datasets, respectively, improving on the benchmark algorithm by 1.08%, 0.39%, and 1.43%. This study also focused on reducing the number of model parameters for applications such as edge and mobile deployment; the experimental results show that accuracy is not noticeably degraded by the reduced parameter count. These results verify the effectiveness of the proposed SA-CapsNet model.
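To make the architecture described in the abstract concrete, the following is a minimal PyTorch sketch of the general idea: a convolutional feature extractor followed by a self-attention block, feeding reduced-dimension primary capsules that are routed to class capsules. All layer sizes, the pooling step, the number of routing iterations, and the class name SACapsNetSketch are illustrative assumptions; this is not the authors' released implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


def squash(s, dim=-1, eps=1e-8):
    """Capsule squashing non-linearity: shrinks short vectors, keeps direction."""
    norm_sq = (s * s).sum(dim=dim, keepdim=True)
    return (norm_sq / (1.0 + norm_sq)) * s / torch.sqrt(norm_sq + eps)


class SACapsNetSketch(nn.Module):
    # Hypothetical configuration; the paper's actual hyperparameters may differ.
    def __init__(self, vocab_size=20000, embed_dim=128, num_classes=2,
                 num_caps=32, caps_dim=8, out_caps_dim=16, routing_iters=3):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        # N-gram feature extraction, as in CNN-based text classifiers.
        self.conv = nn.Conv1d(embed_dim, embed_dim, kernel_size=3, padding=1)
        # Self-attention mapped onto the feature-extraction layer to capture
        # long-distance dependencies between tokens.
        self.attn = nn.MultiheadAttention(embed_dim, num_heads=4, batch_first=True)
        # Primary capsules with a reduced per-capsule dimension (caps_dim).
        self.primary = nn.Linear(embed_dim, num_caps * caps_dim)
        # Transformation matrices from primary capsules to class capsules.
        self.W = nn.Parameter(0.01 * torch.randn(num_caps, num_classes,
                                                 out_caps_dim, caps_dim))
        self.num_caps, self.caps_dim = num_caps, caps_dim
        self.num_classes, self.routing_iters = num_classes, routing_iters

    def forward(self, token_ids):
        x = self.embed(token_ids)                                  # (B, T, D)
        x = F.relu(self.conv(x.transpose(1, 2))).transpose(1, 2)  # (B, T, D)
        x, _ = self.attn(x, x, x)                                  # attention over tokens
        x = x.mean(dim=1)                                          # pool the sequence
        u = squash(self.primary(x).view(-1, self.num_caps, self.caps_dim))
        # Prediction vectors u_hat: (B, num_caps, num_classes, out_caps_dim).
        u_hat = torch.einsum('ijdc,bic->bijd', self.W, u)
        # Dynamic routing-by-agreement between primary and class capsules.
        b = torch.zeros(u.size(0), self.num_caps, self.num_classes, device=u.device)
        for _ in range(self.routing_iters):
            c = F.softmax(b, dim=2).unsqueeze(-1)        # coupling coefficients
            v = squash((c * u_hat).sum(dim=1))           # class capsules
            b = b + (u_hat * v.unsqueeze(1)).sum(dim=-1)
        return v.norm(dim=-1)                            # capsule lengths as class scores


# Toy usage: a batch of 4 "sentences" of 50 token ids each -> (4, 2) class scores.
scores = SACapsNetSketch()(torch.randint(0, 20000, (4, 50)))
print(scores.shape)
```

In this sketch the capsule lengths serve directly as class scores, a common convention for capsule classifiers; how the paper combines the attention output with the capsules, and how the added intermediate layer is defined, is not specified in the abstract.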