Data synthesis via differentially private Markov random fields

Published:2021-07 Issue:11 Volume:14 Page:2190-2202
ISSN:2150-8097
Container-title:Proceedings of the VLDB Endowment
language:en
Short-container-title:Proc. VLDB Endow.

Author:

Cai Kuntai¹,Lei Xiaoyu²,Wei Jianxin¹,Xiao Xiaokui¹

Affiliation:

1. National University of Singapore

2. University of Connecticut

Abstract

This paper studies the synthesis of high-dimensional datasets with differential privacy (DP). The state-of-the-art solution addresses this problem by first generating a set M of noisy low-dimensional marginals of the input data D, and then using them to approximate the data distribution in D for synthetic data generation. However, it imposes several constraints on M that considerably limit the choices of marginals. This makes it difficult to capture all important correlations among attributes, which in turn degrades the quality of the resulting synthetic data. To address this deficiency, we propose PrivMRF, a method that (i) also utilizes a set M of low-dimensional marginals for synthesizing high-dimensional data with DP, but (ii) provides a high degree of flexibility in the choices of marginals. The key idea of PrivMRF is to select an appropriate M to construct a Markov random field (MRF) that models the correlations among the attributes in the input data, and then use the MRF for data synthesis. Experimental results on four benchmark datasets show that PrivMRF consistently outperforms the state of the art in terms of the accuracy of counting queries and classification tasks conducted on the generated synthetic data.
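
As a rough illustration of what a "noisy low-dimensional marginal" can look like, the sketch below counts records over a small subset of attributes and perturbs each cell with Laplace noise. The abstract does not say which noise mechanism PrivMRF uses, so the Laplace mechanism here is an assumption, and the names `noisy_marginal`, `records`, `attrs`, and `domain_sizes` are hypothetical. It is not the paper's algorithm, only a minimal example of the building block the abstract refers to.

```python
import random
from collections import Counter
from itertools import product


def noisy_marginal(records, attrs, domain_sizes, epsilon, rng):
    """Return a Laplace-noised count for every cell of the marginal over `attrs`.

    Assumption: neighbouring datasets differ by adding or removing one record,
    so one record changes exactly one cell by 1 (L1 sensitivity 1).
    """
    # Exact counts of each observed combination of the selected attributes.
    counts = Counter(tuple(r[a] for a in attrs) for r in records)
    # Enumerate every cell of the marginal, including cells with zero count.
    cells = product(*(range(domain_sizes[a]) for a in attrs))
    # The difference of two Exp(rate=epsilon) draws is Laplace with scale 1/epsilon.
    return {
        cell: counts[cell] + rng.expovariate(epsilon) - rng.expovariate(epsilon)
        for cell in cells
    }


# Toy data: 1000 records over three categorical attributes (illustrative only).
rng = random.Random(0)
domain_sizes = [2, 3, 4]
records = [tuple(rng.randrange(k) for k in domain_sizes) for _ in range(1000)]

# A 2-way marginal over attributes 0 and 2, released with epsilon = 1.0.
marginal = noisy_marginal(records, attrs=(0, 2), domain_sizes=domain_sizes,
                          epsilon=1.0, rng=rng)
```

Each released marginal consumes part of the privacy budget, which is one reason the choice of the set M matters: more marginals capture more correlations, but each one must then be noisier for the same total budget.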

Publisher

VLDB Endowment

Subject

General Earth and Planetary Sciences,Water Science and Technology,Geography, Planning and Development

Link

https://dl.acm.org/doi/pdf/10.14778/3476249.3476272

Cited by 26 articles.

1. VertiMRF: Differentially Private Vertical Federated Data Synthesis;Proceedings of the 30th ACM SIGKDD Conference on Knowledge Discovery and Data Mining;2024-08-24

2. Does Differentially Private Synthetic Data Lead to Synthetic Discoveries?;Methods of Information in Medicine;2024-08-13

3. SoK: Privacy-Preserving Data Synthesis;2024 IEEE Symposium on Security and Privacy (SP);2024-05-19

4. Epistemic Parity: Reproducibility as an Evaluation Metric for Differential Privacy;ACM SIGMOD Record;2024-05-14

5. 30 Years of Synthetic Data;Statistical Science;2024-05-01