A Histogram Publishing Method under Differential Privacy That Involves Balancing Small-Bin Availability First

Author:

Chen Jianzhang1,Zhou Shuo1,Qiu Jie1,Xu Yixin1,Zeng Bozhe1,Fang Wanchuan1,Chen Xiangying1,Huang Yipeng1,Xu Zhengquan2,Chen Youqin23

Affiliation:

1. College of Computer and Information Sciences, Fujian Agriculture and Forestry University, Fuzhou 350002, China

2. State Key Laboratory of Information Engineering in Surveying, Mapping and Remote Sensing, Wuhan University, Wuhan 430079, China

3. College of Computer Science and Mathematics, Fujian University of Technology, Fuzhou 350118, China

Abstract

Differential privacy, a cornerstone of privacy-preserving techniques, plays an indispensable role in ensuring the secure handling and sharing of sensitive data analysis across domains such as in census, healthcare, and social networks. Histograms, serving as a visually compelling tool for presenting analytical outcomes, are widely employed in these sectors. Currently, numerous algorithms for publishing histograms under differential privacy have been developed, striving to balance privacy protection with the provision of useful data. Nonetheless, the pivotal challenge concerning the effective enhancement of precision for small bins (those intervals that are narrowly defined or contain a relatively small number of data points) within histograms has yet to receive adequate attention and in-depth investigation from experts. In standard DP histogram publishing, adding noise without regard for bin size can result in small data bins being disproportionately influenced by noise, potentially severely impairing the overall accuracy of the histogram. In response to this challenge, this paper introduces the SReB_GCA sanitization algorithm designed to enhance the accuracy of small bins in DP histograms. The SReB_GCA approach involves sorting the bins from smallest to largest and applying a greedy grouping strategy, with a predefined lower bound on the mean relative error required for a bin to be included in a group. Our theoretical analysis reveals that sorting bins in ascending order prior to grouping effectively prioritizes the accuracy of smaller bins. SReB_GCA ensures strict ϵ-DP compliance and strikes a careful balance between reconstruction error and noise error, thereby not only initially improving the accuracy of small bins but also approximately optimizing the mean relative error of the entire histogram. To validate the efficiency of our proposed SReB_GCA method, we conducted extensive experiments using four diverse datasets, including two real-life datasets and two synthetic ones. The experimental results, quantified by the Kullback–Leibler Divergence (KLD), show that the SReB_GCA algorithm achieves substantial performance enhancement compared to the baseline method (DP_BASE) and several other established approaches for differential privacy histogram publication.

Funder

Natural Science Foundation of China

Natural Science Foundation of Fujian Province, China

Science and Technology Innovation Special Fund of Fujian Agriculture and Forestry University

Undergraduate Innovation and Entrepreneurship Training Program of Fujian Agricultural and Forestry University

Research Fund of Fujian University of Technology

Publisher

MDPI AG

Reference36 articles.

同舟云学术

1.学者识别学者识别

2.学术分析学术分析

3.人才评估人才评估

"同舟云学术"是以全球学者为主线,采集、加工和组织学术论文而形成的新型学术文献查询和分析系统,可以对全球学者进行文献检索和人才价值评估。用户可以通过关注某些学科领域的顶尖人物而持续追踪该领域的学科进展和研究前沿。经过近期的数据扩容,当前同舟云学术共收录了国内外主流学术期刊6万余种,收集的期刊论文及会议论文总量共计约1.5亿篇,并以每天添加12000余篇中外论文的速度递增。我们也可以为用户提供个性化、定制化的学者数据。欢迎来电咨询!咨询电话:010-8811{复制后删除}0370

www.globalauthorid.com

TOP

Copyright © 2019-2024 北京同舟云网络信息技术有限公司
京公网安备11010802033243号  京ICP备18003416号-3