Affiliation:
1. Department of Mathematics and Statistics, Vassar College, New York, USA
2. Labor, Human Services, and Population, Urban Institute, Washington, D.C., USA
Abstract
Synthetic data generation is a powerful tool for privacy protection when considering public release of record‐level data files. Initially proposed about three decades ago, it has generated significant research and application interest. To meet the pressing demand of data privacy protection in a variety of contexts, the field needs more researchers and practitioners. This review provides a comprehensive introduction to synthetic data, including technical details of their generation and evaluation. Our review also addresses the challenges and limitations of synthetic data, discusses practical applications, and provides thoughts for future work.
This article is categorized under:
Statistical and Graphical Methods of Data Analysis > Modeling Methods and Algorithms
Funder
Alfred P. Sloan Foundation
Subject
Statistics and Probability