Affiliation:
1. United States Naval Academy .
2. University of Maryland , Colleage Park .
3. University of Massachusetts Amherst .
4. George Washington University .
Abstract
Abstract
We consider a scenario where multiple organizations holding large amounts of sensitive data from their users wish to compute aggregate statistics on this data while protecting the privacy of individual users. To support large-scale analytics we investigate how this privacy can be provided for the case of sketching algorithms running in time sub-linear of the input size.
We begin with the well-known LogLog sketch for computing the number of unique elements in a data stream. We show that this algorithm already achieves differential privacy (even without adding any noise) when computed using a private hash function by a trusted curator. Next, we show how to eliminate this requirement of a private hash function by injecting a small amount of noise, allowing us to instantiate an efficient LogLog protocol for the multi-party setting. To demonstrate the practicality of this approach, we run extensive experimentation on multiple data sets, including the publicly available IP address data set from University of Michigan’s scans of internet IPv4 space, to determine the trade-offs among efficiency, privacy and accuracy of our implementation for varying numbers of parties and input sizes.
Finally, we generalize our approach for the LogLog sketch and obtain a general framework for constructing multi-party differentially private protocols for several other sketching algorithms.
Reference70 articles.
1. [1] Gail-Joon Ahn, Moti Yung, and Ninghui Li, editors. ACM CCS 14. ACM Press, November 2014.
2. [2] Kook Jin Ahn, Sudipto Guha, and Andrew McGregor. Analyzing graph structure via linear measurements. In Yuval Rabani, editor, 23rd SODA, pages 459–467. ACM-SIAM, January 2012.
3. [3] Nir Ailon and Bernard Chazelle. The fast johnson– lindenstrauss transform and approximate nearest neighbors. SIAM Journal on computing, 39(1):302–322, 2009.
4. [4] Mohammad Alaggan, Mathieu Cunche, and Sébastien Gambs. Privacy-preserving wi-fi analytics. Proceedings on Privacy Enhancing Technologies, 2018(2):4–26, 2018.
5. [5] Noga Alon, Yossi Matias, and Mario Szegedy. The space complexity of approximating the frequency moments. J. Comput. Syst. Sci., 58(1):137–147, 1999.
Cited by
10 articles.
订阅此论文施引文献
订阅此论文施引文献,注册后可以免费订阅5篇论文的施引文献,订阅后可以查看论文全部施引文献