Affiliation:
1. University of Crete and FORTH-ICS, Greece
2. FORTH-ICS, Greece
3. TEI of Serres and FORTH-ICS, Greece
Abstract
With the increasing use of Web 2.0 to create, disseminate, and consume large volumes of data, more and more information is published and becomes available for potential data consumers, that is, applications/services, individual users and communities, outside their production site. The most representative example of this trend is Linked Open Data (LOD), a set of interlinked data and knowledge bases. The main challenge in this context is data governance within loosely coordinated organizations that are publishing added-value interlinked data on the Web, bringing together issues related to data management and data quality, in order to support the full lifecycle of data production, consumption, and management. In this article, we are interested in curation issues for RDF(S) data, which is the default data model for LOD. In particular, we are addressing change management for RDF(S) data maintained by large communities (scientists, librarians, etc.) which act as curators to ensure high quality of data. Such curated Knowledge Bases (KBs) are constantly evolving for various reasons, such as the inclusion of new experimental evidence or observations, or the correction of erroneous conceptualizations. Managing such changes poses several research problems, including the problem of detecting the changes (delta) between versions of the same KB developed and maintained by different groups of curators, a crucial task for assisting them in understanding the involved changes. This becomes all the more important as curated KBs are interconnected (through copying or referencing) and thus changes need to be propagated from one KB to another either within or across communities. This article addresses this problem by proposing a change language which allows the formulation of concise and intuitive deltas. The language is expressive enough to describe unambiguously any possible change encountered in curated KBs expressed in RDF(S), and can be efficiently and deterministically detected in an automated way. Moreover, we devise a change detection algorithm which is sound and complete with respect to the aforementioned language, and study appropriate semantics for executing the deltas expressed in our language in order to move backwards and forwards in a multiversion repository, using only the corresponding deltas. Finally, we evaluate through experiments the effectiveness and efficiency of our algorithms using real ontologies from the cultural, bioinformatics, and entertainment domains.
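To make the notion of a delta concrete, the following is a minimal sketch of the simplest possible form: a low-level delta computed as the set difference between two versions of a KB, each modelled as a set of RDF triples. It is not the high-level change language or the detection algorithm proposed in the article; the names `Delta`, `detect`, `apply_forward`, and `apply_backward`, as well as the `ex:` triples, are hypothetical and chosen only for illustration.

```python
from typing import FrozenSet, Iterable, NamedTuple, Tuple

# A triple is (subject, predicate, object); plain strings stand in for IRIs.
Triple = Tuple[str, str, str]


class Delta(NamedTuple):
    """Low-level delta: triples added and deleted between two versions."""
    added: FrozenSet[Triple]
    deleted: FrozenSet[Triple]


def detect(old: Iterable[Triple], new: Iterable[Triple]) -> Delta:
    """Compute the low-level delta that turns `old` into `new`."""
    old_set, new_set = frozenset(old), frozenset(new)
    return Delta(added=new_set - old_set, deleted=old_set - new_set)


def apply_forward(version: Iterable[Triple], delta: Delta) -> FrozenSet[Triple]:
    """Move from the older version to the newer one using only the delta."""
    return (frozenset(version) - delta.deleted) | delta.added


def apply_backward(version: Iterable[Triple], delta: Delta) -> FrozenSet[Triple]:
    """Move from the newer version back to the older one using only the delta."""
    return (frozenset(version) - delta.added) | delta.deleted


# Hypothetical example data: the superclass of ex:Painting changes.
v1 = {
    ("ex:Painting", "rdfs:subClassOf", "ex:Artifact"),
    ("ex:MonaLisa", "rdf:type", "ex:Painting"),
}
v2 = {
    ("ex:Painting", "rdfs:subClassOf", "ex:Artwork"),
    ("ex:MonaLisa", "rdf:type", "ex:Painting"),
}

delta = detect(v1, v2)
```

Here `delta` holds one added and one deleted triple, and `apply_forward(v1, delta)` and `apply_backward(v2, delta)` recover `v2` and `v1` respectively. A high-level change language of the kind the abstract describes would aim to report such a pair as a single, more intuitive change rather than as two unrelated triple operations.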
Funder
Seventh Framework Programme
Publisher
Association for Computing Machinery (ACM)
Cited by
29 articles.