Author:
Huang Subin, Xiu Yu, Li Jun, Liu Sanmin, Kong Chao
Abstract
Entity synonyms play a significant role in entity-based tasks. Previous approaches use linguistic syntax, distributional, and semantic features to expand entity synonym sets from text corpora. Because Chinese expression is flexible and complex, these approaches still struggle to expand entity synonym sets robustly from Chinese text: they fail to track holistic semantics among entities and suffer from error propagation. This paper introduces an approach for expanding Chinese entity synonym sets based on bilateral context and a filtering strategy. The approach consists of two novel components. First, a bilateral-context-based Siamese network classifier is proposed to determine whether a new entity should be inserted into an existing entity synonym set. The classifier tracks the holistic semantics of bilateral contexts and can impose soft holistic semantic constraints to improve synonym prediction. Second, a filtering-strategy-based set expansion algorithm is presented to generate Chinese entity synonym sets. The filtering strategy enforces semantic and domain consistency to filter out wrong synonym entities, thereby mitigating error propagation. Experimental results on two real-world Chinese datasets demonstrate that the proposed approach is effective and outperforms the selected state-of-the-art approaches on the Chinese entity synonym set expansion task.
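The two-stage idea in the abstract (a classifier that decides whether a candidate joins a set, followed by a filter that rejects inconsistent candidates) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the character-overlap score is a toy stand-in for the Siamese network classifier, the `domain` lookup is a hypothetical stand-in for the domain-consistency check, and the names `jaccard`, `set_score`, `expand` and the threshold `accept` are all assumptions made for illustration.

```python
from typing import Dict, List, Set


def jaccard(a: str, b: str) -> float:
    """Toy similarity: overlap of the character sets of two strings.

    Assumption: stands in for a learned classifier score.
    """
    sa, sb = set(a), set(b)
    union = sa | sb
    return len(sa & sb) / len(union) if union else 0.0


def set_score(synset: Set[str], candidate: str) -> float:
    """Average similarity between the candidate and every current member."""
    return sum(jaccard(member, candidate) for member in synset) / len(synset)


def expand(seed: str, candidates: List[str], domain: Dict[str, str],
           accept: float = 0.5) -> Set[str]:
    """Greedily grow a synonym set from a seed entity.

    Step 1 (classifier stand-in): keep the candidate only if its score
    against the current set reaches the `accept` threshold.
    Step 2 (filter stand-in): drop the candidate if its domain label
    differs from the seed's, so one bad insertion cannot pull in more.
    """
    synset = {seed}
    for cand in candidates:
        if cand in synset:
            continue
        if set_score(synset, cand) < accept:
            continue  # rejected by the classifier stand-in
        if domain.get(cand) != domain.get(seed):
            continue  # rejected by the domain-consistency filter
        synset.add(cand)
    return synset
```

For example, calling `expand("USA", ["U.S.A.", "USSR", "SAU", "US"], domain)` with a `domain` mapping that labels "SAU" differently from the other entities rejects "USSR" at the scoring step and "SAU" at the filtering step, even though "SAU" scores highly on character overlap alone.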
Funder
Excellent Young Talents Fund Program of Higher Education Institutions of Anhui Province
University Natural Science Research Projects of Anhui Province
Publisher
Springer Science and Business Media LLC
Subject
Computational Mathematics, Engineering (miscellaneous), Information Systems, Artificial Intelligence
Cited by
1 article.