Affiliation:
1. Insight Data Science Lab, Federal University of Ceará, Ceará, Brazil
2. Institute of Information Science and Technologies, National Research Council, Pisa, Italy
Abstract
Named Entity Recognition (NER) is a challenging learning task of identifying and classifying entity mentions in texts into predefined categories. In recent years, deep learning (DL) methods empowered by distributed representations, such as word- and character-level embeddings, have been employed in NER systems. However, for information extraction in Police narrative reports, the performance of a DL-based NER approach is limited due to the presence of fine-grained ambiguous entities. For example, given the narrative report “Anna stole Ada’s car”, imagine that we intend to identify the VICTIM and the ROBBER, two sub-labels of PERSON. Traditional NER systems have limited performance in categorizing entity labels arranged in a hierarchical structure. Furthermore, it is unfeasible to obtain information from knowledge bases to give a disambiguated meaning between the entity mentions and the actual labels. This information must be extracted directly from the context dependencies. In this paper, we deal with the Hierarchical Entity-Label Disambiguation problem in Police reports without the use of knowledge bases. To tackle such a problem, we present HELD, an ensemble model that combines two components for NER: a BLSTM-CRF architecture and a NER tool. Experiments conducted on a real Police reports dataset show that HELD significantly outperforms baseline approaches.
Subject
Artificial Intelligence,Computer Vision and Pattern Recognition,Theoretical Computer Science
Cited by
2 articles.
订阅此论文施引文献
订阅此论文施引文献,注册后可以免费订阅5篇论文的施引文献,订阅后可以查看论文全部施引文献