Abstract
Recognizing named entities (NEs) is commonly treated as a classification problem, in which a class tag is predicted for a word or an NE candidate in a sentence. Recent neural network approaches adopt deep structures that map categorized features into continuous representations. This unfolds a dense space saturated with high-order abstract semantic information, and prediction is based on distributed feature representations. In this paper, the positions of NEs in a sentence are represented as continuous values, and a regression operation is introduced to regress the boundaries of NEs in a sentence. Based on boundary regression, we design a model that supports nested NE recognition. It is a multiobjective learning framework that simultaneously predicts the classification score of an NE candidate and refines its spatial location in a sentence. The model was evaluated on the ACE 2005 Chinese and English corpora and the GENIA corpus. It achieved state-of-the-art performance for nested NE recognition, outperforming related work by about 5% and 2%, respectively. The model can resolve nested NEs and supports boundary regression for locating NEs in a sentence. By sharing parameters between predicting and locating, it enables more potent nonlinear function approximators to enhance model discriminability.
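The multiobjective idea described above can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify how boundaries are parameterized or how the two objectives are combined, so the centre/log-length offsets, the smooth L1 penalty, the `weight` parameter, and all function names here are assumptions borrowed from common bounding-box regression practice.

```python
import math


def smooth_l1(x):
    """Smooth L1 penalty (assumed choice of regression loss)."""
    ax = abs(x)
    return 0.5 * x * x if ax < 1.0 else ax - 0.5


def boundary_targets(candidate, gold):
    """Offsets that move a candidate span onto a gold span.

    Spans are (start, end) positions treated as continuous values, with
    end > start. The offsets are the centre shift (scaled by the candidate
    length) and the log length ratio; this parameterization is hypothetical.
    """
    c_start, c_end = candidate
    g_start, g_end = gold
    c_len, g_len = c_end - c_start, g_end - g_start
    c_mid, g_mid = (c_start + c_end) / 2.0, (g_start + g_end) / 2.0
    return ((g_mid - c_mid) / c_len, math.log(g_len / c_len))


def refine(candidate, offsets):
    """Apply predicted offsets to a candidate span (inverse of boundary_targets)."""
    c_start, c_end = candidate
    c_len = c_end - c_start
    c_mid = (c_start + c_end) / 2.0
    d_mid, d_len = offsets
    mid = c_mid + d_mid * c_len
    length = c_len * math.exp(d_len)
    return (mid - length / 2.0, mid + length / 2.0)


def joint_loss(class_probs, gold_class, pred_offsets, target_offsets, weight=1.0):
    """Classification loss plus weighted boundary-regression loss."""
    cls = -math.log(class_probs[gold_class])  # cross-entropy for one candidate
    reg = sum(smooth_l1(p - t) for p, t in zip(pred_offsets, target_offsets))
    return cls + weight * reg


# A candidate covering positions [2, 6) is refined toward a gold NE at [3, 6).
targets = boundary_targets((2.0, 6.0), (3.0, 6.0))
refined = refine((2.0, 6.0), targets)
loss = joint_loss([0.5, 0.5], 0, targets, targets)
```

Because both terms are computed from the same candidate, a network trained on `joint_loss` would share its parameters between scoring and locating, which is the property the abstract highlights.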
Funder
Innovative Research Group Project of the National Natural Science Foundation of China
Publisher
Springer Science and Business Media LLC
Subject
Cognitive Neuroscience, Computer Science Applications, Computer Vision and Pattern Recognition
Cited by
10 articles.