Affiliation:
1. Tel Aviv University and University of Pennsylvania
2. Ben Gurion University and University of Pennsylvania
3. Tel Aviv University
4. University of Pennsylvania
Abstract
Provenance information has proven very effective in capturing the computational process performed by queries, and has been used extensively as input to many advanced data management tools (e.g., view maintenance, trust assessment, and query answering in probabilistic databases).
We observe here that while different (set-)equivalent queries may admit different provenance expressions when evaluated on the same database, there is always some part of these expressions that is common to all. We refer to this part as the core provenance. In addition to being informative, the core provenance is also useful as a compact input to the aforementioned data management tools. We formally define the notion of core provenance. We study algorithms that, given a query, compute an equivalent query (called p-minimal) such that, for every input database, the provenance of every result tuple is the core provenance. We study such algorithms for queries of varying expressive power (namely, conjunctive queries with disequalities and unions thereof). Finally, we observe that, in general, one would not want to require database systems to execute a specific p-minimal query, but rather to be able to find, possibly off-line, the core provenance of a given tuple in the output (computed by an arbitrary equivalent query) without reevaluating the query. We provide algorithms for such direct computation of the core provenance.
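The observation that set-equivalent queries can yield different provenance expressions can be illustrated with a minimal sketch. It assumes a polynomial-style provenance model (joint use of tuples multiplies their tokens, alternative derivations are added), which the abstract itself does not specify. The relation `R`, the tokens `p1` and `p2`, and both queries are hypothetical examples, not taken from the paper.

```python
from collections import Counter
from itertools import product

# Hypothetical annotated relation R(x, y): each tuple carries a provenance token.
R = {("a", "b"): "p1", ("a", "c"): "p2"}

def provenance_q1(rel):
    """Q1: ans(x) :- R(x, y). Each matching tuple contributes one monomial."""
    prov = {}
    for (x, _y), token in rel.items():
        # A polynomial is a Counter mapping monomial (tuple of tokens) -> coefficient.
        prov.setdefault(x, Counter())[(token,)] += 1
    return prov

def provenance_q2(rel):
    """Q2: ans(x) :- R(x, y), R(x, z). Set-equivalent to Q1, but joins R with itself."""
    prov = {}
    for ((x1, _y), t1), ((x2, _z), t2) in product(rel.items(), repeat=2):
        if x1 == x2:
            # Joint use of two tuples multiplies their tokens into one monomial.
            monomial = tuple(sorted((t1, t2)))
            prov.setdefault(x1, Counter())[monomial] += 1
    return prov

q1 = provenance_q1(R)
q2 = provenance_q2(R)
print(q1["a"])  # polynomial p1 + p2
print(q2["a"])  # polynomial p1*p1 + 2*p1*p2 + p2*p2
```

Both queries return the same answer `a`, yet Q2 records a larger expression. Every monomial of Q1's expression is contained in some monomial of Q2's, which gives an informal sense of the part shared by all equivalent queries; the paper's formal definition of core provenance should be consulted for the precise notion.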
Publisher
Association for Computing Machinery (ACM)
Cited by
10 articles.
1. Minimally Factorizing the Provenance of Self-join Free Conjunctive Queries;Proceedings of the ACM on Management of Data;2024-05-10
2. Heuristic and Cost-Based Optimization for Diverse Provenance Tasks;IEEE Transactions on Knowledge and Data Engineering;2019-07-01
3. ProvCite;Proceedings of the VLDB Endowment;2019-03
4. Context-aware result inference in crowdsourcing;Information Sciences;2018-09
5. Fides;Proceedings of the 29th International Conference on Scientific and Statistical Database Management;2017-06-27