Affiliation:
1. School of Computer and Science, Hubei University of Technology, Wuhan (086)430068, China
2. School of Information and Safety Engineering, Zhongnan University of Economics and Law, Wuhan (086)430073, China
Abstract
With the growing popularity of traditional Chinese medicine (TCM) in the world and the increasing awareness of intellectual property protection, the number of TCM patent application is growing year by year. TCM patents contain rich medical, legal, and economic information. Effective text mining of TCM patents is of great theoretical and practical significance (e.g., the R&D of new medicines, patent infringement litigation, and patent acquisition). Named entity recognition (NER) is a fundamental task in natural language processing and a crucial step before indepth analysis of TCM patent. In this paper, a method combining Bidirectional Long Short-Term Memory neural network with Conditional Random Field (BiLSTM-CRF) is proposed to automatically recognize entities of interest (i.e., herb names, disease names, symptoms, and therapeutic effects) from the abstract texts of TCM patents. By virtue of the capabilities of deep learning methods, the semantic information in the context can be learned without feature engineering. Experiments show that the BiLSTM-CRF-based method provides superior performance in comparison with various baseline methods.
Funder
Fundamental Research Funds for the Central Universities
Subject
Electrical and Electronic Engineering,Computer Networks and Communications,Information Systems
Cited by
24 articles.
订阅此论文施引文献
订阅此论文施引文献,注册后可以免费订阅5篇论文的施引文献,订阅后可以查看论文全部施引文献