Affiliation:
1. Indiana University Bloomington, Berkeley, CA
2. Indiana University Bloomington, Bloomington, IN
Abstract
Data provenance, a form of metadata describing the life cycle of a data product, is crucial in the sharing of research data. Research data, when shared over decades, requires recipients to make a determination of both use and trust. That is, can they use the data? More importantly, can they trust it? Knowing that the data are of high quality is one factor in establishing fitness for use and trust. Provenance can be used to assert the quality of the data, but the quality of the provenance itself must be known as well. We propose a framework for assessing the quality of data provenance. We identify quality issues in data provenance, establish key quality dimensions, and define a framework of analysis. We apply the analysis framework to synthetic and real-world provenance.
Funder
National Aeronautics and Space Administration
Publisher
Association for Computing Machinery (ACM)
Subject
Information Systems and Management, Information Systems
Cited by
7 articles.