Affiliation:
1. College of Information and Computer, Taiyuan University of Technology, Taiyuan, Shanxi 030000, China
Abstract
The Internet is rich in information related to the financial field. Financial entity text that contains new Internet vocabulary affects the results of existing recognition algorithms, and handling new vocabulary and polysemy remains an open problem in the field. This paper proposes an ERNIE-Doc-BiLSTM-CRF named entity recognition model built on a pretrained language model. Compared with traditional models, the ERNIE-Doc pretrained language model builds a distinct vector for each word by combining its word vector with position encoding, which addresses the polysemy problem well. Its intensive skimming mechanism handles long text well and captures context information effectively. The experimental results show that the model reaches an accuracy of 86.72%, a recall of 83.39%, and an F1 value of 85.02%; compared with other models, accuracy is 13.36% higher, recall is 13.05% higher, and the F1 value is 13.21% higher.
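As a quick consistency check on the reported figures, the following minimal sketch recomputes F1 as the harmonic mean of precision and recall. It assumes that the value the abstract calls "accuracy" plays the role of precision, which the abstract does not state explicitly; the function name `f1_score` and the variable names are illustrative only.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (both given in the same units)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Assumption: the reported "accuracy" (86.72%) is treated as precision.
reported_precision = 86.72
reported_recall = 83.39

f1 = f1_score(reported_precision, reported_recall)
print(f"F1 = {f1:.2f}%")
```

Under that assumption the result rounds to 85.02, which agrees with the F1 value given in the abstract.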
Funder
Key R&D Projects in Shanxi Province
Subject
General Mathematics, General Medicine, General Neuroscience, General Computer Science