Linear regression with partially mismatched data: local search with theoretical guarantees

Published:2022-08-17 Issue:2 Volume:197 Page:1265-1303
ISSN:0025-5610
Container-title:Mathematical Programming
language:en
Short-container-title:Math. Program.

Author:

Mazumder Rahul, Wang Haoyue

Abstract

Linear regression is a fundamental modeling tool in statistics and related fields. In this paper, we study an important variant of linear regression in which the predictor-response pairs are partially mismatched. We use an optimization formulation to simultaneously learn the underlying regression coefficients and the permutation corresponding to the mismatches. The combinatorial structure of the problem leads to computational challenges. We propose and study a simple greedy local search algorithm for this optimization problem that enjoys strong theoretical guarantees and appealing computational performance. We prove that, under a suitable scaling of the number of mismatched pairs relative to the number of samples and features, and under certain assumptions on the problem data, our local search algorithm converges to a nearly optimal solution at a linear rate. In particular, in the noiseless case, our algorithm converges to the globally optimal solution at a linear rate. Based on this result, we prove an upper bound on the estimation error of the parameter. We also propose an approximate local search step that allows us to scale our approach to much larger instances. We conduct numerical experiments to gather further insights into our theoretical results, and show promising performance gains compared to existing approaches.
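
The idea of jointly fitting coefficients and a permutation by greedy local search can be illustrated with a small sketch. This is not the authors' algorithm as published: it is one plausible swap-based variant, written under the assumptions that X has full column rank, that each step swaps exactly two responses, and that no budget on the number of mismatches is enforced. The function name local_search_mismatched and its parameters are hypothetical.

```python
import numpy as np

def local_search_mismatched(X, y, max_iter=1000, tol=1e-10):
    """Greedy pairwise-swap local search (illustrative sketch only).

    Tries to reorder y so that the least-squares residual of y[perm] on X
    is as small as possible. Assumes X has full column rank.
    """
    n = X.shape[0]
    Q, _ = np.linalg.qr(X)            # orthonormal basis for the column space of X
    M = np.eye(n) - Q @ Q.T           # projector onto the orthogonal complement
    diag_m = np.diag(M)
    perm = np.arange(n)               # start from the identity matching

    for _ in range(max_iter):
        z = y[perm]
        r = M @ z                     # residual after the optimal fit for this matching
        # delta[i, j] = z[j] - z[i]: change in entry i when entries i and j are swapped
        delta = z[None, :] - z[:, None]
        # Change in squared residual norm for every candidate swap (i, j),
        # using the fact that M is symmetric and idempotent.
        change = (2.0 * delta * (r[:, None] - r[None, :])
                  + delta ** 2 * (diag_m[:, None] + diag_m[None, :] - 2.0 * M))
        i, j = np.unravel_index(np.argmin(change), change.shape)
        if change[i, j] >= -tol:      # no swap improves the objective: local optimum
            break
        perm[[i, j]] = perm[[j, i]]   # accept the best swap

    beta, *_ = np.linalg.lstsq(X, y[perm], rcond=None)
    return beta, perm

# Usage on synthetic noiseless data with one mismatched pair.
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(40, 3))
y_demo = X_demo @ np.array([1.0, -2.0, 0.5])
y_demo[[0, 1]] = y_demo[[1, 0]]       # introduce a mismatch
beta_hat, perm_hat = local_search_mismatched(X_demo, y_demo)
```

Each iteration scores all pairwise swaps in O(n^2) time through a closed-form update of the residual norm, so the objective never increases. Restricting the search to permutations that move only a bounded number of entries, as the abstract's "number of mismatched pairs" suggests, would require an additional feasibility check that this sketch omits.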

Funder

Office of Naval Research

National Science Foundation

Publisher

Springer Science and Business Media LLC

Subject

General Mathematics, Software

Link

https://link.springer.com/content/pdf/10.1007/s10107-022-01863-y.pdf

References (26 articles)

1. Abid, A., Poon, A., Zou, J.: Linear regression with shuffled labels. arXiv preprint arXiv:1705.01342 (2017)

2. Abid, A., Zou, J.: Stochastic EM for shuffled linear regression. arXiv preprint arXiv:1804.00681 (2018)

3. Balakrishnan, A.V.: On the problem of time jitter in sampling. IRE Transactions on Information Theory 8(3), 226–236 (1962)

4. Blackman, S.S.: Multiple-target tracking with radar applications. Artech House, Norwood, MA (1986)

5. DeGroot, M.H., Feder, P.I., Goel, P.K.: Matchmaking. Ann. Math. Stat. 42(2), 578–593 (1971)