Affiliation:
1. School of Computer Science and Engineering, Central South University, Changsha, China
Abstract
Text classification is a critical task in natural language processing. While pre-trained language models such as BERT have made significant strides in this area, the distinctive dependency information present in text has not been fully exploited. Moreover, BERT captures phrase-level information mostly in its lower layers, and this information becomes progressively weaker as layer depth increases. To address these limitations, our work enhances text classification by incorporating attention matrices into the fine-tuning of pre-trained models such as BERT. Our approach, named AM-BERT, leverages learned dependency relationships as external knowledge to enhance the pre-trained model by generating attention matrices. In addition, we introduce a new learning strategy that enables the model to retain the phrase-level structure information it has already learned. Extensive experiments and detailed analysis on multiple benchmark datasets demonstrate the effectiveness of our approach on text classification tasks. Furthermore, we show that AM-BERT also achieves stable performance improvements on named entity recognition tasks.
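To make the idea of dependency-derived attention matrices concrete, the following is a minimal sketch of how such a matrix might be built for a single sentence. The abstract does not specify the parser or the injection mechanism, so the use of spaCy, the function name dependency_attention_matrix, and the additive-bias comment at the end are illustrative assumptions, not the paper's actual implementation.

```python
# Hypothetical sketch: build a dependency-based attention matrix for one sentence.
# spaCy is used only for illustration; AM-BERT's parser choice is not stated here.
import numpy as np
import spacy

nlp = spacy.load("en_core_web_sm")

def dependency_attention_matrix(sentence: str) -> np.ndarray:
    """Return an n x n matrix whose (i, j) entry is 1 when tokens i and j
    are linked by a dependency arc (in either direction), plus self-links."""
    doc = nlp(sentence)
    n = len(doc)
    mat = np.eye(n, dtype=np.float32)          # every token attends to itself
    for token in doc:
        if token.head.i != token.i:            # skip the root's self-arc
            mat[token.i, token.head.i] = 1.0   # child -> head
            mat[token.head.i, token.i] = 1.0   # head -> child
    return mat

# One plausible (assumed) way to use such a matrix during fine-tuning is as an
# additive bias on the attention logits, e.g.
#   scores = scores + lam * np.log(mat + eps)
# where lam weights the external dependency knowledge.
print(dependency_attention_matrix("The model classifies short texts."))
```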
Funder
National Natural Science Foundation of China
Subject
Artificial Intelligence, Computational Theory and Mathematics, Theoretical Computer Science, Control and Systems Engineering
Cited by
4 articles.