Abstract
The application of a currently proposed differential privacy algorithm to the 2020 United States Census data and additional data products may affect the usefulness of these data, the accuracy of estimates and rates derived from them, and critical knowledge about social phenomena such as health disparities. We test the ramifications of applying differential privacy to released data by studying estimates of US mortality rates for the overall population and three major racial/ethnic groups. We ask how changes in the denominators of these vital rates due to the implementation of differential privacy can lead to biased estimates. We situate where these changes are most likely to matter by disaggregating biases by population size, degree of urbanization, and adjacency to a metropolitan area. Our results suggest that differential privacy will more strongly affect mortality rate estimates for non-Hispanic blacks and Hispanics than estimates for non-Hispanic whites. We also find significant changes in estimated mortality rates for less populous areas, with more pronounced changes when stratified by race/ethnicity. We find larger changes in estimated mortality rates for areas with lower levels of urbanization or adjacency to metropolitan areas, with these changes being greater for non-Hispanic blacks and Hispanics. These findings highlight the consequences of implementing differential privacy, as proposed, for research examining population composition, particularly mortality disparities across racial/ethnic groups and along the urban/rural continuum. Overall, they demonstrate the challenges in using the data products derived from the proposed disclosure avoidance methods, while highlighting critical instances where scientific understandings may be negatively impacted.
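The denominator effect described in the abstract can be illustrated with a small arithmetic sketch. It assumes a crude rate (deaths divided by population) and a fixed, hypothetical denominator error of 20 people; this is not the Census Bureau's disclosure avoidance algorithm, and the function names and numbers are illustrative only.

```python
def mortality_rate(deaths, population, per=100_000):
    """Crude mortality rate per `per` people."""
    if population <= 0:
        raise ValueError("population must be positive")
    return deaths / population * per


def relative_rate_change(deaths, true_pop, perturbed_pop):
    """Relative change in the rate when only the denominator is perturbed."""
    true_rate = mortality_rate(deaths, true_pop)
    perturbed_rate = mortality_rate(deaths, perturbed_pop)
    return (perturbed_rate - true_rate) / true_rate


if __name__ == "__main__":
    HYPOTHETICAL_ERROR = 20  # assumed absolute denominator error, in people
    for pop in (200, 2_000, 200_000):
        deaths = 0.01 * pop  # assumed 1% crude death rate, for illustration
        change = relative_rate_change(deaths, pop, pop + HYPOTHETICAL_ERROR)
        print(f"population={pop:>7}  relative change in rate={change:+.4%}")
```

Because the death count stays fixed, the relative change reduces to true_pop / perturbed_pop - 1, so the same absolute error distorts the rate far more in a small area or subgroup than in a large one.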
Funder
HHS | NIH | Eunice Kennedy Shriver National Institute of Child Health and Human Development
Publisher
Proceedings of the National Academy of Sciences
Cited by
45 articles.