Author
CHENEY JAMES, AHMED AMAL, ACAR UMUT A.
Abstract
Provenance is information recording the source, derivation or history of some information. Provenance tracking has been studied in a variety of settings, particularly database management systems. However, although many candidate definitions of provenance have been proposed, the mathematical or semantic foundations of data provenance have received comparatively little attention. In this paper, we argue that dependency analysis techniques familiar from program analysis and program slicing provide a formal foundation for forms of provenance that are intended to show how (part of) the output of a query depends on (parts of) its input. We introduce a semantic characterisation of such dependency provenance for a core database query language, show that minimal dependency provenance is not computable, and provide dynamic and static approximation techniques. We also discuss preliminary implementation experience with using dependency provenance to compute data slices, or summaries of the parts of the input relevant to a given part of the output.
Publisher
Cambridge University Press (CUP)
Subject
Computer Science Applications, Mathematics (miscellaneous)
Cited by
31 articles.
1. Gaining trust by tracing security protocols;Journal of Logical and Algebraic Methods in Programming;2023-01
2. FORSETI: A visual analysis environment enabling provenance awareness for the accountability of e-autopsy reports;Visual Informatics;2022-09
3. The Systematic Design of Responsibility Analysis by Abstract Interpretation;ACM Transactions on Programming Languages and Systems;2022-03-31
4. Ownership at Large;Proceedings of the 28th International Conference on Program Comprehension;2020-07-13
5. Improving data scientist efficiency with provenance;Proceedings of the ACM/IEEE 42nd International Conference on Software Engineering;2020-06-27