Affiliation:
1. Harvard University, Cambridge, MA
2. The University of Vermont, New York, NY
3. Google, New York, NY
Abstract
Public-use earth science datasets are a useful resource with the unfortunate feature that their provenance is easily disconnected from their content. “Fair-use policies” typically associated with these datasets require appropriate attribution of providers by users, but sound and complete attribution is difficult if provenance information is lost. To address this, we introduce a technique to directly associate provenance information with sensor datasets. Our technique is similar to traditional watermarking but is intended for application to unstructured time-series datasets. Our approach is potentially imperceptible given sufficient margins of error in datasets and is robust to a number of benign but likely transformations including truncation, rounding, bit-flipping, sampling, and reordering. We provide algorithms for both one-bit and blind mark checking and show how our system can be adapted to various data representation types. Our algorithms are probabilistic in nature and are characterized by both combinatorial and empirical analyses. Mark embedding can be applied at any point in the data life cycle, allowing adaptation of our scheme to social or scientific concerns.
Publisher
Association for Computing Machinery (ACM)
Subject
Information Systems and Management,Information Systems
Cited by
3 articles.
订阅此论文施引文献
订阅此论文施引文献,注册后可以免费订阅5篇论文的施引文献,订阅后可以查看论文全部施引文献