Affiliation:
1. Jiangsu Key Lab of Big Data Security & Intelligent Processing, School of Computer Science, Nanjing University of Posts and Telecommunications, Nanjing 210023, China
2. Zhejiang Engineering Research Center of Intelligent Medicine, Wenzhou 325035, China
Abstract
Background. Clinical named entity recognition is a basic task in mining electronic medical record text. Chinese electronic medical records pose particular challenges: the text contains many compound entities, sentence components are frequently omitted, and entity boundaries are unclear. Moreover, corpora of Chinese electronic medical records are difficult to obtain. Methods. To address these characteristics, this study proposes a Chinese clinical entity recognition model based on deep learning pretraining. The model uses word embeddings trained on a domain corpus and fine-tunes an entity recognition model pretrained on a related corpus. BiLSTM and Transformer are then used, respectively, as feature extractors to identify four types of clinical entities (diseases, symptoms, drugs, and operations) in the text of Chinese electronic medical records. Results. On the test dataset, the model achieved 75.06% Macro-P, 76.40% Macro-R, and 75.72% Macro-F1. Conclusions. These experiments show that the proposed Chinese clinical entity recognition model based on deep learning pretraining can effectively improve recognition performance.
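For readers unfamiliar with the macro-averaged metrics reported in the Results, the following is a minimal sketch of how Macro-P, Macro-R, and Macro-F1 could be computed over the four entity types. The per-type counts are hypothetical and are not taken from the paper, and the sketch assumes that Macro-F1 is the unweighted mean of the per-type F1 scores; the abstract does not state which averaging convention was used.

```python
from typing import Dict, Tuple

# Hypothetical (true positive, false positive, false negative) counts per
# entity type. Illustrative only; these are NOT results from the paper.
counts: Dict[str, Tuple[int, int, int]] = {
    "disease": (80, 20, 20),
    "symptom": (90, 10, 30),
    "drug": (70, 30, 10),
    "operation": (60, 20, 20),
}


def prf(tp: int, fp: int, fn: int) -> Tuple[float, float, float]:
    """Precision, recall, and F1 for a single entity type."""
    p = tp / (tp + fp) if tp + fp else 0.0
    r = tp / (tp + fn) if tp + fn else 0.0
    f = 2 * p * r / (p + r) if p + r else 0.0
    return p, r, f


def macro_scores(
    type_counts: Dict[str, Tuple[int, int, int]]
) -> Tuple[float, float, float]:
    """Unweighted mean of per-type precision, recall, and F1.

    Assumption: Macro-F1 is the mean of per-type F1 values, not the
    harmonic mean of Macro-P and Macro-R.
    """
    per_type = [prf(*c) for c in type_counts.values()]
    n = len(per_type)
    macro_p = sum(p for p, _, _ in per_type) / n
    macro_r = sum(r for _, r, _ in per_type) / n
    macro_f1 = sum(f for _, _, f in per_type) / n
    return macro_p, macro_r, macro_f1


if __name__ == "__main__":
    mp, mr, mf = macro_scores(counts)
    print(f"Macro-P={mp:.4f}  Macro-R={mr:.4f}  Macro-F1={mf:.4f}")
```

Macro averaging weights each entity type equally regardless of how often it occurs, so a rare type such as operations influences the score as much as a frequent one.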
Funder
National Natural Science Foundation of China
Subject
Health Informatics, Biomedical Engineering, Surgery, Biotechnology
Cited by
25 articles.