Low-Rank Approximation and Regression in Input Sparsity Time

Author:

Kenneth L. Clarkson¹, David P. Woodruff¹

Affiliation:

1. IBM Research, Almaden, Harry Road, San Jose, CA

Abstract

We design a new distribution over m × n matrices S so that, for any fixed n × d matrix A of rank r, with probability at least 9/10, ∥SAx∥₂ = (1 ± ε)∥Ax∥₂ simultaneously for all x ∈ ℝᵈ. Here, m is bounded by a polynomial in rε⁻¹, and the parameter ε ∈ (0, 1]. Such a matrix S is called a subspace embedding. Furthermore, SA can be computed in O(nnz(A)) time, where nnz(A) is the number of nonzero entries of A. This improves over all previous subspace embeddings, for which computing SA required at least Ω(nd log d) time. We call these S sparse embedding matrices.

Using our sparse embedding matrices, we obtain the fastest known algorithms for overconstrained least-squares regression, low-rank approximation, approximating all leverage scores, and ℓ_p regression. More specifically, let b be an n × 1 vector, ε > 0 a small enough value, and integers k, p ⩾ 1. Our results include the following.

Regression: The regression problem is to find a d × 1 vector x′ for which ∥Ax′ − b∥_p ⩽ (1 + ε) min_x ∥Ax − b∥_p. For the Euclidean case p = 2, we obtain an algorithm running in O(nnz(A)) + Õ(d³ε⁻²) time, and another in O(nnz(A) log(1/ε)) + Õ(d³ log(1/ε)) time. (Here, Õ(f) = f · log^{O(1)}(f).) For p ∈ [1, ∞), more generally, we obtain an algorithm running in O(nnz(A) log n) + O(rε⁻¹)^C time, for a fixed constant C.

Low-rank approximation: We give an algorithm to obtain a rank-k matrix Âₖ such that ∥A − Âₖ∥_F ⩽ (1 + ε)∥A − Aₖ∥_F, where Aₖ is the best rank-k approximation to A. (That is, Aₖ is the output of principal components analysis, produced by a truncated singular value decomposition, useful for latent semantic indexing and many other statistical problems.) Our algorithm runs in O(nnz(A)) + Õ(nk²ε⁻⁴ + k³ε⁻⁵) time.

Leverage scores: We give an algorithm to estimate the leverage scores of A, up to a constant factor, in O(nnz(A) log n) + Õ(r³) time.
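A minimal sketch of the idea behind the abstract, assuming a CountSketch-style construction for S (each column of S has exactly one nonzero entry, ±1 at a uniformly random row), used in the standard "sketch-and-solve" pattern for least-squares regression; the helper name `sparse_embedding` and the specific dimensions are illustrative choices, not from the paper:

```python
import numpy as np

def sparse_embedding(A, m, rng):
    """Apply a CountSketch-style sparse embedding S to A, returning S @ A.

    Each of the n coordinates is hashed to one of m rows with a random
    sign, so S @ A costs O(nnz(A)) work: every row of A touches exactly
    one row of the output.
    """
    n = A.shape[0]
    rows = rng.integers(0, m, size=n)          # hash each coordinate to a row
    signs = rng.choice([-1.0, 1.0], size=n)    # random +/-1 sign per coordinate
    SA = np.zeros((m, A.shape[1]))
    for i in range(n):
        SA[rows[i]] += signs[i] * A[i]
    return SA

# Sketch-and-solve: minimize ||S(Ax - b)||_2 instead of ||Ax - b||_2.
# With m = poly(d/eps) rows, the subspace-embedding property makes the
# sketched solution (1 + eps)-optimal with constant probability.
rng = np.random.default_rng(0)
n, d = 2000, 10
A = rng.standard_normal((n, d))
b = A @ rng.standard_normal(d) + 0.1 * rng.standard_normal(n)

m = 400  # illustrative sketch size, much smaller than n
SAb = sparse_embedding(np.hstack([A, b[:, None]]), m, rng)
SA, Sb = SAb[:, :d], SAb[:, d]

x_sketch, *_ = np.linalg.lstsq(SA, Sb, rcond=None)  # small m x d problem
x_exact, *_ = np.linalg.lstsq(A, b, rcond=None)     # full n x d problem
```

The loop over rows is written for clarity; in practice S @ A would be applied via a single pass over the nonzeros of A (e.g. with `np.add.at` or a sparse-matrix product), which is what gives the O(nnz(A)) running time.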

Funder

Defense Advanced Research Projects Agency

Air Force Research Laboratory

XDATA

Publisher

Association for Computing Machinery (ACM)

Subject

Artificial Intelligence, Hardware and Architecture, Information Systems, Control and Systems Engineering, Software


Cited by 110 articles.

1. Improving compressed matrix multiplication using control variate method;Information Processing Letters;2025-01

2. Accelerated Double-Sketching Subspace Newton;European Journal of Operational Research;2024-12

3. On the Consistency and Large-Scale Extension of Multiple Kernel Clustering;IEEE Transactions on Pattern Analysis and Machine Intelligence;2024-10

4. Recent and Upcoming Developments in Randomized Numerical Linear Algebra for Machine Learning;Proceedings of the 30th ACM SIGKDD Conference on Knowledge Discovery and Data Mining;2024-08-24

5. Statistical inference for sketching algorithms;Information and Inference: A Journal of the IMA;2024-07-01
