Affiliation:
1. School of Automation Science and Electrical Engineering, Beihang University, No. 37 XueYuan Road, Haidian District, Beijing 100191, P. R. China
Abstract
Relation extraction (RE) is a crucial step in knowledge graph construction that aims to extract meaningful relations between entity pairs in plain text. Very few studies have addressed Chinese relation extraction (CRE) in the military domain. Moreover, although recent deep neural network-based methods have achieved considerable performance, they still suffer from three inherent limitations: entity overlapping, imbalanced data, and ambiguity. This work therefore proposes a novel Multi-Grained Lattice Transformer (MGLT), which leverages external lexicon and word-sense information tailored for CRE. In MGLT, self-matched lexicon words and related word senses are fused through a cross-transformer mechanism to alleviate ambiguity in texts. The resulting enriched sequence representation captures the relatedness between the head entity and the tail entity, which helps alleviate entity overlapping. Experimental results on two benchmark datasets and a self-developed dataset constructed from online military news show that the proposed MGLT achieves state-of-the-art (SOTA) performance. Compared with other typical baselines, MGLT improves area under the curve (AUC) and F1-score by up to 10.46% and 6.90%, respectively. We further demonstrate the effectiveness of ensemble learning in exploiting complementary information from multiple MGLT-based base learners to improve overall performance on imbalanced data classification for the military dataset. These results indicate that the proposed ensemble learning model is effective and robust enough for practical applications.
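The paper's implementation is not reproduced here; the following is a minimal, illustrative sketch of the general idea the abstract describes, namely fusing character-level representations with self-matched lexicon words and candidate word senses via cross-attention. All names (char_emb, lexicon_emb, sense_emb, cross_attention), dimensions, and random weights are assumptions for illustration only, not the authors' MGLT implementation.

```python
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def cross_attention(query, context, d_k=64, rng=None):
    """Single-head cross-attention: queries from one granularity
    (e.g. characters) attend to another (e.g. matched lexicon words
    or word senses). Projection weights are random purely for illustration."""
    rng = rng if rng is not None else np.random.default_rng(0)
    d_q, d_c = query.shape[-1], context.shape[-1]
    W_q = rng.normal(scale=0.02, size=(d_q, d_k))
    W_k = rng.normal(scale=0.02, size=(d_c, d_k))
    W_v = rng.normal(scale=0.02, size=(d_c, d_k))
    Q, K, V = query @ W_q, context @ W_k, context @ W_v
    scores = Q @ K.T / np.sqrt(d_k)    # (n_query, n_context)
    return softmax(scores) @ V         # (n_query, d_k)

# Hypothetical inputs: 10 characters, 4 self-matched lexicon words,
# and 6 candidate word senses, each with its own embedding size.
rng = np.random.default_rng(0)
char_emb    = rng.normal(size=(10, 128))
lexicon_emb = rng.normal(size=(4, 100))
sense_emb   = rng.normal(size=(6, 50))

# Let characters attend to lexicon- and sense-level context, then
# concatenate the views into an enriched sequence representation.
char_lex   = cross_attention(char_emb, lexicon_emb, rng=rng)
char_sense = cross_attention(char_emb, sense_emb, rng=rng)
enriched   = np.concatenate([char_emb, char_lex, char_sense], axis=-1)
print(enriched.shape)  # (10, 128 + 64 + 64)
```

In a trained model, the projection matrices would of course be learned parameters and the fused representation would feed a relation classifier; this sketch only illustrates how lexicon- and sense-level context can be attached to each character position.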
Publisher
World Scientific Pub Co Pte Ltd
Subject
Computer Science Applications,Modeling and Simulation,General Engineering,General Mathematics