Affiliation:
1. Center for Semantic Web Research, DCC, University of Chile, Santiago, Chile
Abstract
Existential blank nodes greatly complicate a number of fundamental operations on Resource Description Framework (RDF) graphs. In particular, the problems of determining if two RDF graphs have the same structure modulo blank node labels (i.e., if they are isomorphic), or determining if two RDF graphs have the same meaning under simple semantics (i.e., if they are simple-equivalent), have no known polynomial-time algorithms. In this article, we propose methods that can produce two canonical forms of an RDF graph. The first canonical form preserves isomorphism such that any two isomorphic RDF graphs will produce the same canonical form; this iso-canonical form is produced by modifying the well-known canonical labelling algorithm Nauty for application to RDF graphs. The second canonical form additionally preserves simple-equivalence such that any two simple-equivalent RDF graphs will produce the same canonical form; this equi-canonical form is produced by, in a preliminary step, leaning the RDF graph, and then computing the iso-canonical form. These algorithms have a number of practical applications, such as for identifying isomorphic or equivalent RDF graphs in a large collection without requiring pairwise comparison, for computing checksums or signing RDF graphs, for applying consistent Skolemisation schemes where blank nodes are mapped in a canonical manner to Internationalised Resource Identifiers (IRIs), and so forth. Likewise a variety of algorithms can be simplified by presupposing RDF graphs in one of these canonical forms. Both algorithms require exponential steps in the worst case; in our evaluation we demonstrate that there indeed exist difficult synthetic cases, but we also provide results over 9.9 million RDF graphs that suggest such cases occur infrequently in the real world, and that both canonical forms can be efficiently computed in all but a handful of such cases.
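The idea of an iso-canonical form, and its use for checksums, can be illustrated with a minimal Python sketch. This is not the Nauty-based algorithm of the article: it is a brute-force stand-in that tries every relabelling of the blank nodes and keeps the lexicographically smallest result, so it is only usable on graphs with a handful of blank nodes. The representation of triples as tuples of strings, the `_:` prefix test, the `_:c0`, `_:c1`, ... labels, and the function names `iso_canonical` and `graph_hash` are all assumptions made for illustration.

```python
import hashlib
from itertools import permutations


def is_blank(term):
    # Assumption: blank nodes are written as strings with the "_:" prefix.
    return term.startswith("_:")


def iso_canonical(graph):
    """Return a sorted tuple of triples with canonically relabelled blank nodes.

    Brute force: every bijection from the graph's blank nodes onto the
    labels _:c0 ... _:c(n-1) is tried, and the smallest sorted triple
    list is kept. Isomorphic graphs therefore yield the same result.
    """
    triples = set(graph)
    bnodes = sorted({t for triple in triples for t in triple if is_blank(t)})
    labels = [f"_:c{i}" for i in range(len(bnodes))]
    best = None
    for perm in permutations(labels):
        mapping = dict(zip(bnodes, perm))
        candidate = tuple(sorted(
            tuple(mapping.get(t, t) for t in triple) for triple in triples
        ))
        if best is None or candidate < best:
            best = candidate
    return best


def graph_hash(graph):
    # Checksum of the canonical form; equal for isomorphic graphs.
    text = "\n".join(" ".join(triple) for triple in iso_canonical(graph))
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


g1 = [("_:a", "ex:knows", "_:b"), ("_:b", "ex:name", '"Bob"')]
g2 = [("_:x", "ex:name", '"Bob"'), ("_:y", "ex:knows", "_:x")]
g3 = [("_:a", "ex:knows", "_:b"), ("_:a", "ex:name", '"Bob"')]

print(graph_hash(g1) == graph_hash(g2))  # same structure, different labels
print(graph_hash(g1) == graph_hash(g3))  # different structure
```

Because each graph is reduced to a single hash, a large collection can be grouped by hash value instead of being compared pairwise. The sketch covers isomorphism only; it does not lean the graph, so it does not detect simple-equivalence.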
Funder
Fondecyt
Millennium Nucleus Center for Semantic Web Research
Publisher
Association for Computing Machinery (ACM)
Subject
Computer Networks and Communications
Cited by
18 articles.