Affiliation:
1. Beihang University, China
2. Shenzhen Institute of Computing Sciences, China; University of Edinburgh, United Kingdom; Beihang University, China
3. Shenzhen Institute of Computing Sciences, China
Abstract
This paper studies a new problem, relation enrichment. Given a relation D of schema R and a knowledge graph G with overlapping information, the problem is to identify a small number of relevant features from G and extend schema R with the additional attributes, so as to maximally improve the accuracy of resolving entities represented by the tuples of D. We formulate the enrichment problem and show its intractability. Nonetheless, we propose a method to extract features from G that are diverse from the existing attributes of R, minimize null values, and moreover, reduce false positives and false negatives of entity resolution (ER) models. The method links tuples and vertices that refer to the same entity, learns a robust policy to extract attributes via reinforcement learning, and jointly trains the policy and ER models. Moreover, we develop algorithms for (incrementally) enriching D. Using real-life data, we experimentally verify that relation enrichment improves the accuracy of ER by more than 15.4 percentage points by adding 5 attributes, and by up to 33%.
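The problem setting can be illustrated with a minimal sketch. It is not the method of the paper: the reinforcement-learning policy and the joint training with ER models are replaced here by a naive heuristic that picks the k graph properties, not already in the schema, with the fewest null values among linked tuples. The toy data, the `links` mapping (assumed to be given), and the function name `enrich` are all hypothetical.

```python
from collections import Counter

# Toy relation D with schema R = (id, name, city). All data is made up.
D = [
    {"id": 1, "name": "A. Smith", "city": "Leeds"},
    {"id": 2, "name": "Alice Smith", "city": "Leeds"},
    {"id": 3, "name": "A. Smith", "city": "Leeds"},
]

# Toy knowledge graph G as (vertex, property, value) triples.
G = [
    ("v1", "employer", "Acme"), ("v1", "birth_year", "1980"), ("v1", "city", "Leeds"),
    ("v2", "employer", "Acme"), ("v2", "birth_year", "1980"),
    ("v3", "employer", "Globex"),
]

# Assumption: tuple-to-vertex links are already known (the paper learns them).
links = {1: "v1", 2: "v2", 3: "v3"}


def enrich(relation, graph, links, k):
    """Extend each tuple with the k best-covered graph properties.

    Naive stand-in heuristic: rank properties outside the schema by how many
    linked tuples have a value for them, i.e. by fewest resulting nulls.
    """
    schema = set(relation[0])

    # Group graph properties by vertex.
    props = {}
    for vertex, prop, value in graph:
        props.setdefault(vertex, {})[prop] = value

    # Properties of the vertex linked to each tuple (empty if unlinked).
    linked = [props.get(links.get(t["id"]), {}) for t in relation]

    # Count non-null coverage of each candidate property not already in R.
    coverage = Counter(p for ps in linked for p in ps if p not in schema)
    ranked = sorted(coverage.items(), key=lambda kv: (-kv[1], kv[0]))
    chosen = [p for p, _ in ranked[:k]]

    # Extend every tuple; missing values become None (null).
    enriched = [{**t, **{p: ps.get(p) for p in chosen}}
                for t, ps in zip(relation, linked)]
    return chosen, enriched


chosen, enriched = enrich(D, G, links, k=1)
for row in enriched:
    print(row)
```

In this toy instance, tuples 1 and 3 agree on every attribute of R, so an ER model would have nothing to separate them; the added `employer` attribute differs between them, which is the kind of signal enrichment is meant to supply.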
Publisher
Association for Computing Machinery (ACM)