Affiliation:
1. Virginia Tech, Falls Church, VA
2. George Mason University, Fairfax, VA
3. U. S. Army Corps of Engineers, Alexandria, VA
Abstract
Noisy and corrupted data have drawn increasing attention to robust least-squares regression (RLSR), which addresses the fundamental problem of learning reliable regression coefficients when response variables can be arbitrarily corrupted. Until now, the following important challenges could not be handled concurrently: (1) providing a rigorous recovery guarantee for the regression coefficients, (2) estimating the corruption ratio parameter, and (3) scaling to massive datasets. This article proposes a novel Robust regression algorithm via Heuristic Corruption Thresholding (RHCT) that addresses all of these challenges concurrently. Specifically, the algorithm alternately optimizes the regression coefficients and estimates the optimal uncorrupted set via heuristic thresholding, without a pre-defined corruption ratio parameter, until convergence. Moreover, to improve the efficiency of corruption estimation on large-scale data, a Robust regression algorithm via Adaptive Corruption Thresholding (RACT) is proposed, which determines the size of the uncorrupted set with a novel adaptive search method that does not iterate over the data samples exhaustively. In addition, we prove that our algorithms benefit from strong guarantees analogous to those of state-of-the-art methods in terms of convergence rates and recovery guarantees. Extensive experiments demonstrate that our new methods are more effective than existing methods in recovering both regression coefficients and uncorrupted sets, with very competitive efficiency.
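The alternating scheme described in the abstract (fit the coefficients on the current estimate of the uncorrupted set, then re-estimate that set by thresholding residuals, with no corruption ratio supplied) can be illustrated with a minimal sketch. This is not the RHCT or RACT algorithm itself: the abstract does not specify the thresholding rule, so the sketch assumes a simple median-based cutoff as a stand-in. The names `fit_line` and `robust_fit`, the factor `k`, and the synthetic data are all hypothetical, and the model is restricted to a single feature to keep the example dependency-free.

```python
import random
import statistics


def fit_line(xs, ys):
    """Ordinary least squares for y = w * x + b on the given points."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    w = sxy / sxx
    return w, mean_y - w * mean_x


def robust_fit(xs, ys, k=3.0, max_iter=50):
    """Alternate between fitting on the estimated clean set and re-thresholding.

    Assumption: the cutoff is k * 1.4826 * median(|residual|), a common robust
    scale estimate used here as a placeholder, not the paper's heuristic.
    """
    keep = list(range(len(xs)))  # start by trusting every sample
    for _ in range(max_iter):
        # Step 1: refit the coefficients on the current estimated clean set.
        w, b = fit_line([xs[i] for i in keep], [ys[i] for i in keep])
        # Step 2: re-estimate the clean set from residuals over all samples.
        abs_res = [abs(y - (w * x + b)) for x, y in zip(xs, ys)]
        tau = k * 1.4826 * statistics.median(abs_res)
        new_keep = [i for i, r in enumerate(abs_res) if r <= tau]
        if new_keep == keep:  # the estimated clean set stopped changing
            break
        keep = new_keep
    return w, b, keep


# Synthetic data: y = 2x + 1 with small noise, 10% of responses corrupted.
rng = random.Random(0)
xs = [i / 10 for i in range(100)]
ys = [2.0 * x + 1.0 + rng.gauss(0.0, 0.05) for x in xs]
corrupted = rng.sample(range(100), 10)
for i in corrupted:
    ys[i] += rng.uniform(8.0, 12.0)  # large one-sided corruption

w, b, keep = robust_fit(xs, ys)
print(f"w={w:.3f}, b={b:.3f}, kept {len(keep)} of {len(xs)} samples")
```

In this sketch the size of the retained set is never fixed in advance; it falls out of the residual cutoff at each iteration, which mirrors the abstract's point about avoiding a pre-defined corruption ratio. The recovery and convergence guarantees claimed in the abstract apply to the authors' algorithms, not to this simplified stand-in.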
Funder
U. S. Military Research Laboratory and the U. S. Military Research Office
Publisher
Association for Computing Machinery (ACM)
Cited by
2 articles.
1. Online and Distributed Robust Regressions with Extremely Noisy Labels;ACM Transactions on Knowledge Discovery from Data;2022-06-30
2. Robust Multi-target Regression for Correlated Data Corruption;2020 IEEE International Conference on Data Mining (ICDM);2020-11