Affiliation:
1. Technical University of Munich, Germany
Abstract
Methods for privacy-preserving data publishing and analysis trade off privacy risks for individuals against the quality of output data. In this article, we present a data publishing algorithm that satisfies the differential privacy model. The transformations performed are truthful, which means that the algorithm does not perturb input data or generate synthetic output data. Instead, records are randomly drawn from the input dataset and the uniqueness of their features is reduced. This also offers an intuitive notion of privacy protection. Moreover, the approach is generic, as it can be parameterized with different objective functions to optimize its output towards different applications. We show this by integrating six well-known data quality models. We present an extensive analytical and experimental evaluation and a comparison with prior work. The results show that our algorithm is the first practical implementation of the described approach and that it can be used with reasonable privacy parameters resulting in high degrees of protection. Moreover, when parameterizing the generic method with an objective function quantifying the suitability of data for building statistical classifiers, we measured prediction accuracies that compare very well with results obtained using state-of-the-art differentially private classification algorithms.
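The idea described in the abstract, drawing records at random and then reducing the uniqueness of their features, can be illustrated with a minimal Python sketch. This is not the authors' implementation. The record fields (`age`, `zip`, `diagnosis`), the parameters `beta` (sampling probability) and `k` (minimum group size), and all function names are assumptions chosen for illustration. The sketch does not derive or verify any differential privacy parameters, and it omits the objective functions that the abstract mentions.

```python
import random
from collections import Counter


def sample_records(records, beta, rng):
    """Keep each record independently with probability beta.

    Records are only selected or dropped, never altered.
    """
    return [r for r in records if rng.random() < beta]


def generalize(record, age_bucket=10):
    """Coarsen the quasi-identifiers (hypothetical fields 'age' and 'zip').

    The output is still truthful: a coarser value, not a perturbed one.
    """
    low = (record["age"] // age_bucket) * age_bucket
    return {
        "age": f"{low}-{low + age_bucket - 1}",
        "zip": record["zip"][:3] + "**",
        "diagnosis": record["diagnosis"],
    }


def suppress_rare(records, k, keys=("age", "zip")):
    """Drop records whose quasi-identifier combination occurs fewer than k times."""
    counts = Counter(tuple(r[q] for q in keys) for r in records)
    return [r for r in records if counts[tuple(r[q] for q in keys)] >= k]


def publish(records, beta, k, seed=None):
    """Sample, generalize, then suppress rare groups (illustrative pipeline only)."""
    rng = random.Random(seed)
    sampled = sample_records(records, beta, rng)
    generalized = [generalize(r) for r in sampled]
    return suppress_rare(generalized, k)


if __name__ == "__main__":
    data = [
        {"age": 31, "zip": "80331", "diagnosis": "A"},
        {"age": 35, "zip": "80333", "diagnosis": "B"},
        {"age": 38, "zip": "80339", "diagnosis": "A"},
        {"age": 62, "zip": "10115", "diagnosis": "C"},
    ]
    for row in publish(data, beta=0.8, k=2, seed=1):
        print(row)
```

Setting `beta=1.0` disables the sampling effect, which makes the generalization and suppression steps easy to inspect in isolation. In that case the lone record in the `60-69` age group is suppressed for `k=2`, because no other record shares its generalized features.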