Differentially Private Simple Linear Regression

Authors:

Daniel Alabi¹, Audra McMillan², Jayshree Sarathy¹, Adam Smith³, Salil Vadhan¹

Affiliations:

1. Harvard John A. Paulson School of Engineering and Applied Sciences

2. Khoury College of Computer Sciences, Northeastern University, and Department of Computer Science, Boston University

3. Department of Computer Science, Boston University

Abstract

Economics and social science research often require analyzing datasets of sensitive personal information at fine granularity, with models fit to small subsets of the data. Unfortunately, such fine-grained analysis can easily reveal sensitive individual information. We study regression algorithms that satisfy differential privacy, a constraint which guarantees that an algorithm’s output reveals little about any individual input data record, even to an attacker with side information about the dataset. Motivated by the Opportunity Atlas, a high-profile, small-area analysis tool in economics research, we perform a thorough experimental evaluation of differentially private algorithms for simple linear regression on small datasets with tens to hundreds of records—a particularly challenging regime for differential privacy. In contrast, prior work on differentially private linear regression focused on multivariate linear regression on large datasets or asymptotic analysis. Through a range of experiments, we identify key factors that affect the relative performance of the algorithms. We find that algorithms based on robust estimators—in particular, the median-based estimator of Theil and Sen—perform best on small datasets (e.g., hundreds of datapoints), while algorithms based on Ordinary Least Squares or Gradient Descent perform better for large datasets. However, we also discuss regimes in which this general finding does not hold. Notably, the differentially private analogues of Theil–Sen (one of which was suggested in a theoretical work of Dwork and Lei) have not been studied in any prior experimental work on differentially private linear regression.
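The abstract's central algorithmic idea is to privatize the Theil–Sen estimator, which estimates the regression slope as a median of pairwise slopes. The following is a minimal illustrative sketch, not the paper's DPTheilSen implementation: the "match once" pairing, the public clipping bound B, and the fixed candidate grid are all assumptions made for this example, and the slope is released via the exponential mechanism applied to a rank-based median utility.

```python
# A minimal, hypothetical sketch of a differentially private Theil-Sen-style
# slope estimate for simple linear regression (not the authors' code; the
# pairing scheme, clipping bound B, and candidate grid are all assumptions).
import numpy as np


def dp_theilsen_slope(x, y, epsilon, B=10.0, grid_size=1001, rng=None):
    """Return an epsilon-DP estimate of the regression slope.

    Each record is used in at most one pairwise slope (a "match once"
    variant), so the rank-based utility below has sensitivity 1.
    """
    rng = np.random.default_rng() if rng is None else rng
    x, y = np.asarray(x, dtype=float), np.asarray(y, dtype=float)

    # Randomly match points into disjoint pairs and compute pairwise slopes.
    n = len(x) - len(x) % 2
    idx = rng.permutation(len(x))[:n]
    p, q = idx[: n // 2], idx[n // 2:]
    dx = x[q] - x[p]
    ok = dx != 0                                   # drop degenerate pairs
    slopes = np.clip((y[q][ok] - y[p][ok]) / dx[ok], -B, B)

    # Exponential mechanism for the median of the clipped slopes:
    # utility(z) = -|#{slopes < z} - m/2|, a rank distance from the median.
    candidates = np.linspace(-B, B, grid_size)
    below = np.searchsorted(np.sort(slopes), candidates)
    utility = -np.abs(below - len(slopes) / 2)
    weights = np.exp(epsilon * utility / 2.0)
    weights /= weights.sum()
    return rng.choice(candidates, p=weights)


# Example on synthetic data (true slope 2):
# x = np.random.normal(size=100)
# y = 2 * x + 1 + 0.5 * np.random.normal(size=100)
# print(dp_theilsen_slope(x, y, epsilon=1.0))
```

In a full estimator one would also release an intercept (or predictions at fixed x values) and split the privacy budget across the releases; the paper evaluates several such variants, whereas this sketch only illustrates the private median-of-slopes step.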

Publisher

Privacy Enhancing Technologies Symposium Advisory Board

Subject

General Medicine


Cited by 2 articles.

1. Improved Differentially Private Regression via Gradient Boosting. 2024 IEEE Conference on Secure and Trustworthy Machine Learning (SaTML), 2024-04-09.

2. Differentially Private Block Coordinate Descent for Linear Regression on Vertically Partitioned Data. Journal of Cybersecurity and Privacy, 2022-11-09.

