Affiliation:
1. Cornell University, Ithaca, NY
Abstract
Publishing data about individuals without revealing sensitive information about them is an important problem. In recent years, a new definition of privacy called k-anonymity has gained popularity. In a k-anonymized dataset, each record is indistinguishable from at least k − 1 other records with respect to certain identifying attributes.
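The definition above can be made concrete with a small check. The following is a minimal sketch, not the authors' implementation: it assumes a table stored as a list of dictionaries, and the function name `is_k_anonymous`, the column names, and the toy records are all hypothetical, chosen only for illustration.

```python
from collections import Counter


def is_k_anonymous(records, identifying_attrs, k):
    """Return True if every combination of identifying-attribute values
    that occurs in `records` is shared by at least k records."""
    # Count how many records fall into each group of identical
    # identifying-attribute values.
    group_sizes = Counter(
        tuple(record[attr] for attr in identifying_attrs) for record in records
    )
    return all(size >= k for size in group_sizes.values())


# Hypothetical, already generalized toy table (not data from the article).
toy_records = [
    {"zip": "148**", "age": "20-29", "condition": "flu"},
    {"zip": "148**", "age": "20-29", "condition": "cancer"},
    {"zip": "130**", "age": "30-39", "condition": "heart disease"},
    {"zip": "130**", "age": "30-39", "condition": "heart disease"},
]
identifying = ["zip", "age"]

print(is_k_anonymous(toy_records, identifying, 2))  # each group has 2 records
print(is_k_anonymous(toy_records, identifying, 3))  # no group has 3 records
```

Each group here contains exactly two records, so the toy table satisfies the check for k = 2 but not for k = 3.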
In this article, we show using two simple attacks that a k-anonymized dataset has some subtle but severe privacy problems. First, an attacker can discover the values of sensitive attributes when there is little diversity in those sensitive attributes. This is a known problem. Second, attackers often have background knowledge, and we show that k-anonymity does not guarantee privacy against attackers using background knowledge. We give a detailed analysis of these two attacks, and we propose a novel and powerful privacy criterion called ℓ-diversity that can defend against such attacks. In addition to building a formal foundation for ℓ-diversity, we show in an experimental evaluation that ℓ-diversity is practical and can be implemented efficiently.
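The first attack can be illustrated with a short sketch. The abstract does not state the formal definition of ℓ-diversity, so the code below assumes the simplest possible reading: every group of records that share the same identifying attributes should contain at least ℓ distinct sensitive values. The function name `min_distinct_sensitive`, the column names, and the toy table are hypothetical.

```python
from collections import defaultdict


def min_distinct_sensitive(records, identifying_attrs, sensitive_attr):
    """Return the smallest number of distinct sensitive values found in any
    group of records sharing the same identifying-attribute values."""
    sensitive_by_group = defaultdict(set)
    for record in records:
        key = tuple(record[attr] for attr in identifying_attrs)
        sensitive_by_group[key].add(record[sensitive_attr])
    # An empty table has no groups; report 0 in that case.
    return min((len(values) for values in sensitive_by_group.values()), default=0)


# Hypothetical toy table in which every group has two records (k = 2).
published = [
    {"zip": "148**", "age": "20-29", "condition": "flu"},
    {"zip": "148**", "age": "20-29", "condition": "cancer"},
    {"zip": "130**", "age": "30-39", "condition": "heart disease"},
    {"zip": "130**", "age": "30-39", "condition": "heart disease"},
]

ell = min_distinct_sensitive(published, ["zip", "age"], "condition")
print(ell)  # the second group holds a single distinct condition
```

Under this assumed reading, the table is 2-anonymous yet only 1-diverse: anyone known to be in the second group is revealed to have heart disease, which is the lack-of-diversity problem the abstract describes.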
Publisher
Association for Computing Machinery (ACM)
Cited by
1461 articles.