Fundamental limits in structured principal component analysis and how to reach them

Published:2023-07-18 Issue:30 Volume:120
ISSN:0027-8424
Container-title:Proceedings of the National Academy of Sciences
language:en
Short-container-title:Proc. Natl. Acad. Sci. U.S.A.

Author:

Barbier Jean¹, Camilli Francesco¹, Mondelli Marco², Sáenz Manuel³

Affiliation:

1. Quantitative Life Sciences and Mathematics Sections, International Centre for Theoretical Physics, Trieste 34151, Italy

2. Institute of Science and Technology Austria, Klosterneuburg 3400, Austria

3. Centro de Matemática, Universidad de La República, Montevideo 11400, Uruguay

Abstract

How do statistical dependencies in measurement noise influence high-dimensional inference? To answer this, we study the paradigmatic spiked matrix model of principal components analysis (PCA), where a rank-one matrix is corrupted by additive noise. We go beyond the usual independence assumption on the noise entries, by drawing the noise from a low-order polynomial orthogonal matrix ensemble. The resulting noise correlations make the setting relevant for applications but analytically challenging. We provide a characterization of the Bayes optimal limits of inference in this model. If the spike is rotation invariant, we show that standard spectral PCA is optimal. However, for more general priors, both PCA and the existing approximate message-passing algorithm (AMP) fall short of achieving the information-theoretic limits, which we compute using the replica method from statistical physics. We thus propose an AMP, inspired by the theory of adaptive Thouless–Anderson–Palmer equations, which is empirically observed to saturate the conjectured theoretical limit. This AMP comes with a rigorous state evolution analysis tracking its performance. Although we focus on specific noise distributions, our methodology can be generalized to a wide class of trace matrix ensembles at the cost of more involved expressions. Finally, despite the seemingly strong assumption of rotation-invariant noise, our theory empirically predicts algorithmic performance on real data, pointing to strong universality properties.
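
For orientation, the spiked matrix model and the spectral PCA estimator named in the abstract can be sketched as follows. This is only a minimal illustration under stated assumptions, not the authors' code: the noise here is an ordinary symmetric Gaussian matrix with independent entries (the baseline case the paper goes beyond, not its structured polynomial ensemble), the spike is taken to be Rademacher, and the function names `spiked_wigner` and `pca_overlap` as well as the values of `n` and `snr` are hypothetical choices made for this example.

```python
import numpy as np

def spiked_wigner(n, snr, rng):
    """Return (Y, x) with Y = (snr / n) * x x^T + W.

    Assumptions for this sketch: x has i.i.d. +/-1 entries and W is a
    symmetric Gaussian matrix with independent entries, which is NOT the
    structured noise ensemble studied in the paper.
    """
    x = rng.choice([-1.0, 1.0], size=n)
    g = rng.standard_normal((n, n))
    w = (g + g.T) / np.sqrt(2 * n)  # symmetric, off-diagonal variance 1/n
    return (snr / n) * np.outer(x, x) + w, x

def pca_overlap(y, x):
    """Absolute normalized overlap between the top eigenvector of y and x."""
    _, eigvecs = np.linalg.eigh(y)  # eigenvalues are returned in ascending order
    v = eigvecs[:, -1]              # unit-norm eigenvector of the largest eigenvalue
    return abs(v @ x) / np.linalg.norm(x)

rng = np.random.default_rng(0)
y, x = spiked_wigner(n=1000, snr=2.0, rng=rng)
overlap = pca_overlap(y, x)
print(f"overlap between PCA estimate and spike: {overlap:.3f}")
```

In this independent Gaussian setting, classical random matrix theory predicts that for `snr` above 1 the overlap approaches sqrt(1 - 1/snr^2) as `n` grows; that baseline does not carry over unchanged to the correlated noise analyzed in the paper.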

Funder

EC | ERC | HORIZON EUROPE European Research Council

Lopez-Loreta Prize

Publisher

Proceedings of the National Academy of Sciences

Subject

Multidisciplinary

Link

https://pnas.org/doi/pdf/10.1073/pnas.2302028120
