Affiliation:
1. Data Mining Lab, School of Computer Engineering, Iran University of Science and Technology, Tehran, Iran
Abstract
Relation extraction is the task of extracting semantic relations between entities in a sentence. It is an essential part of natural language processing tasks such as information extraction, knowledge extraction, question answering, and knowledge base population. This research is motivated by the lack of a relation extraction dataset for the Persian language and by the need to extract knowledge from the growing volume of Persian-language data for different applications. In this paper, we present “PERLEX”, the first Persian dataset for relation extraction, which is an expert-translated version of the “SemEval-2010-Task-8” dataset. We also address Persian relation extraction using state-of-the-art language-agnostic algorithms. We evaluate six models for relation extraction on the proposed bilingual dataset: a non-neural model (as the baseline), three neural models, and two deep learning models fed by multilingual BERT contextual word representations. The best result is an F1-score of 77.66% (obtained by the BERTEM-MTB method), which we report as the state of the art for relation extraction in the Persian language.
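To make the task concrete, the following is a minimal sketch of how one annotated example in the SemEval-2010 Task 8 style could be parsed. That dataset marks the two entities with `<e1>` and `<e2>` tags and labels each sentence with a directed relation such as `Cause-Effect(e2,e1)`. The sketch assumes PERLEX keeps the same tagging convention; the sentence below is invented for illustration and is not taken from either dataset, and `RelationExample` and `parse_example` are hypothetical names.

```python
import re
from typing import NamedTuple


class RelationExample(NamedTuple):
    text: str       # sentence with the entity tags removed
    e1: str         # surface form of the first marked entity
    e2: str         # surface form of the second marked entity
    relation: str   # relation type, e.g. "Cause-Effect" or "Other"
    direction: str  # "(e1,e2)", "(e2,e1)", or "" when undirected


def parse_example(sentence: str, label: str) -> RelationExample:
    """Split a tagged sentence and its label into structured fields."""
    e1 = re.search(r"<e1>(.*?)</e1>", sentence).group(1)
    e2 = re.search(r"<e2>(.*?)</e2>", sentence).group(1)
    plain = re.sub(r"</?e[12]>", "", sentence)
    # The label is a relation name, optionally followed by a direction.
    match = re.fullmatch(r"(.+?)(\(e[12],e[12]\))?", label)
    return RelationExample(plain, e1, e2, match.group(1), match.group(2) or "")


# Illustrative input only (not a real dataset entry).
example = parse_example(
    "The <e1>fire</e1> was caused by a <e2>spark</e2>.",
    "Cause-Effect(e2,e1)",
)
print(example)
```

A relation extraction model receives the sentence and the two entity spans as input and predicts the relation type together with its direction.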
Subject
Computer Science Applications, Software
Cited by
1 article.
1. FarCQA: A Farsi Community Dataset for Question Classification and Answer Selection;2023 13th International Conference on Computer and Knowledge Engineering (ICCKE);2023-11-01