High-dimensional linear regression via implicit regularization

Author:

Peng Zhao (1), Yun Yang (2), Qiao-Chu He (3)

Affiliation:

1. Department of Statistics, Texas A&M University, 400 Bizzell St, College Station, Texas 77843, USA

2. Department of Statistics, University of Illinois Urbana-Champaign, 725 South Wright Street, Champaign, Illinois 61820, USA

3. School of Business, Southern University of Science and Technology, 1088 Xueyuan Boulevard, Shenzhen 518055, China

Abstract

Many statistical estimators for high-dimensional linear regression are $M$-estimators, formed by minimizing a data-dependent squared loss function plus a regularizer. This work considers a new class of estimators implicitly defined through a discretized gradient dynamic system under overparameterization. We show that, under suitable restricted isometry conditions, overparameterization leads to implicit regularization: if we apply gradient descent directly to the residual sum of squares with sufficiently small initial values then, under a proper early stopping rule, the iterates converge to a nearly sparse, rate-optimal solution that improves over explicitly regularized approaches. In particular, the resulting estimator does not suffer from the extra bias induced by explicit penalties, and can achieve the parametric root-$n$ rate when the signal-to-noise ratio is sufficiently high. We also perform simulations comparing our method with explicitly regularized high-dimensional linear regression. Our results illustrate the advantages of using implicit regularization via gradient descent after overparameterization in sparse vector estimation.
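The procedure described in the abstract can be made concrete with a short sketch. The snippet below is a minimal illustration, not the authors' exact algorithm: it assumes the common Hadamard overparameterization beta = u*u - v*v, runs plain gradient descent on the residual sum of squares from a small initialization alpha, and leaves the choice of stopping iterate to the user. The function name gd_implicit_regularization, the step-size heuristic, and all parameter defaults are illustrative assumptions.

    import numpy as np

    def gd_implicit_regularization(X, y, alpha=1e-6, step=None, n_iter=5000):
        # Gradient descent on RSS/(2n) under the (assumed) Hadamard
        # overparameterization beta = u*u - v*v, with small initial values
        # u = v = alpha. No explicit penalty is added; early stopping along
        # the returned path plays the role of the regularizer.
        n, p = X.shape
        if step is None:
            # heuristic step size from the largest singular value of X;
            # a theory-driven choice could replace this
            step = n / (4.0 * np.linalg.norm(X, 2) ** 2)
        u = np.full(p, alpha)
        v = np.full(p, alpha)
        path = []
        for _ in range(n_iter):
            beta = u * u - v * v
            grad = X.T @ (X @ beta - y) / n   # gradient of RSS/(2n) in beta
            u = u - step * 2.0 * u * grad     # chain rule: d(beta)/d(u) = 2u
            v = v + step * 2.0 * v * grad     # chain rule: d(beta)/d(v) = -2v
            path.append(u * u - v * v)
        return path                           # select an iterate by early stopping

    # Usage sketch on simulated sparse data. The oracle stop below is for
    # illustration only; in practice one would stop via a validation set.
    rng = np.random.default_rng(0)
    n, p, s = 100, 500, 5
    X = rng.standard_normal((n, p))
    beta0 = np.zeros(p)
    beta0[:s] = 1.0
    y = X @ beta0 + 0.1 * rng.standard_normal(n)
    path = gd_implicit_regularization(X, y)
    best = min(path, key=lambda b: np.linalg.norm(b - beta0))

Small initialization keeps all coordinates near zero at the start; coordinates aligned with the signal grow quickly under the multiplicative updates, while the rest stay small, which is the implicit-sparsity effect the abstract refers to.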

Publisher

Oxford University Press (OUP)

Subject

Applied Mathematics; Statistics, Probability and Uncertainty; General Agricultural and Biological Sciences; Agricultural and Biological Sciences (miscellaneous); General Mathematics; Statistics and Probability


Cited by 3 articles.