Affiliation:
1. Indiana University, Bloomington, IN
Abstract
Data management is growing in complexity as large-scale applications take advantage of the loosely coupled resources brought together by grid middleware and by abundant storage capacity. Metadata describing the data products used in and generated by these applications is essential to disambiguate the data and enable reuse. Data provenance, one kind of metadata, pertains to the derivation history of a data product starting from its original sources. In this paper we create a taxonomy of data provenance characteristics and apply it to current research efforts in e-science, focusing primarily on scientific workflow approaches. The main aspect of our taxonomy categorizes provenance systems based on why they record provenance, what they describe, how they represent and store provenance, and ways to disseminate it. The survey culminates with an identification of open research problems in the field.
Publisher
Association for Computing Machinery (ACM)
Subject
Information Systems, Software
Cited by
643 articles.