Affiliation:
1. Shandong Normal University, Jinan, China
2. Shandong Women’s University and Shandong Normal University, Jinan, China
3. Shandong Normal University and State Key Laboratory of High-End Server and Storage Technology, Jinan, China
Abstract
Chinese Named Entity Recognition (NER) is an essential task in natural language processing, and its performance directly impacts downstream tasks. The main challenges in Chinese NER are the strong dependence of named entities on context and the lack of word boundary information. Therefore, integrating relevant knowledge into the corresponding entity has become the primary task for Chinese NER. Neither the lattice LSTM model nor the WC-LSTM model makes full use of contextual information. In addition, the lattice LSTM model has a complex structure and does not exploit word information well. To address these problems, we propose a Chinese NER method based on a deep neural network with multiple ways of embedding fusion. First, we use a convolutional neural network to combine the contextual information of the input sequence and apply a self-attention mechanism to integrate lexicon knowledge, compensating for the missing word boundaries. A word feature, context feature, bigram feature, and bigram context feature are obtained for each character. Second, these four features are fused at the embedding layer, yielding four different word embeddings through concatenation. Finally, the fused feature information is fed into the encoding and decoding layers. Experiments on three datasets show that our model can effectively improve the performance of Chinese NER.
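The embedding-layer fusion described above can be sketched as a simple per-character concatenation of the four feature vectors. This is an illustrative sketch only, not the authors' implementation; the feature dimensions and array names are arbitrary assumptions for demonstration.

```python
import numpy as np

# Hypothetical setup: 5 characters, each with an 8-dim vector per feature type.
seq_len, d = 5, 8
rng = np.random.default_rng(0)

# Stand-ins for the four per-character features named in the abstract:
# word, context, bigram, and bigram-context features.
word_feat = rng.standard_normal((seq_len, d))
context_feat = rng.standard_normal((seq_len, d))
bigram_feat = rng.standard_normal((seq_len, d))
bigram_context_feat = rng.standard_normal((seq_len, d))

# Fuse at the embedding layer by concatenation ("cascading"); the fused
# embeddings would then be passed to the encoding and decoding layers.
fused = np.concatenate(
    [word_feat, context_feat, bigram_feat, bigram_context_feat], axis=1
)
print(fused.shape)  # (5, 32)
```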
Funder
Natural Science Foundation of Shandong Province
Joint Funds for Smart Computing of the Natural Science Foundation of Shandong Province
National Natural Science Foundation of China
Publisher
Association for Computing Machinery (ACM)