Affiliation:
1. School of Computer Science and Engineering, Northeastern University, Shenyang 110169, China
Abstract
Existing models for extracting features of complex, similar entities make limited use of relative position information and are weak at extracting key features. Chinese named entity recognition differs from its English counterpart in that Chinese text has no space delimiters, its characters are highly polysemous and homonymous, names are diverse and common, and recognition relies more heavily on complex contextual and linguistic structures. To address these problems, an entity recognition method based on DeBERTa-Attention-BiLSTM-CRF (DABC) is proposed. First, the DeBERTa model extracts features from the data; then, an attention mechanism further enhances the extracted features; finally, a BiLSTM captures long-distance dependencies in the text, and a CRF layer produces the predicted tag sequence, from which the entities in the text are identified. The proposed model is validated on the dataset. Experiments show that the DABC model reaches a precision (P) of 88.167%, a recall (R) of 83.121%, and an F1 value of 85.024% on the dataset. Compared with other models, the F1 value improves by 3–5%, which verifies the superiority of the model. In the future, the method can be extended to recognize complex entities in more fields.
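The final step of the pipeline, turning per-token scores into a predicted tag sequence through the CRF layer, can be illustrated with a small sketch. The abstract does not describe the decoding procedure; the code below assumes standard Viterbi decoding for a linear-chain CRF, and the tag set, emission scores, and transition scores are hypothetical values chosen for illustration, not taken from the paper.

```python
from typing import Dict, List, Tuple

def viterbi_decode(
    emissions: List[Dict[str, float]],
    transitions: Dict[Tuple[str, str], float],
    tags: List[str],
) -> List[str]:
    """Return the highest-scoring tag path for a linear-chain CRF.

    emissions[i][tag] is the per-token score (e.g. from a BiLSTM);
    transitions[(prev, cur)] is the tag-to-tag score (default 0.0).
    """
    if not emissions:
        return []
    # Best score of any path ending in each tag at the first token.
    best = {t: emissions[0][t] for t in tags}
    backpointers = []
    for em in emissions[1:]:
        new_best, ptr = {}, {}
        for cur in tags:
            # Pick the previous tag that maximizes the path score into `cur`.
            prev = max(tags, key=lambda p: best[p] + transitions.get((p, cur), 0.0))
            new_best[cur] = best[prev] + transitions.get((prev, cur), 0.0) + em[cur]
            ptr[cur] = prev
        best = new_best
        backpointers.append(ptr)
    # Trace back from the best final tag.
    path = [max(tags, key=lambda t: best[t])]
    for ptr in reversed(backpointers):
        path.append(ptr[path[-1]])
    path.reverse()
    return path

# Hypothetical tag set and scores (assumptions for illustration only).
TAGS = ["O", "B-PER", "I-PER"]
# Strongly penalize the invalid transition O -> I-PER.
TRANSITIONS = {("O", "I-PER"): -1e4}
EMISSIONS = [
    {"O": 1.0, "B-PER": 0.9, "I-PER": 0.0},
    {"O": 0.0, "B-PER": 0.0, "I-PER": 1.0},
]
decoded = viterbi_decode(EMISSIONS, TRANSITIONS, TAGS)
print(decoded)
```

In this toy input, picking the highest-scoring tag per token independently would give `O` followed by `I-PER`, which is not a valid tag sequence. The transition scores let the decoder prefer a globally consistent path instead, which is the usual motivation for placing a CRF layer after a BiLSTM.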
Funder
National Natural Science Foundation of China