Affiliation:
1. Department of Statistics, Indiana University , Myles Brand Hall E104, 901 E. 10th Street , Bloomington, Indiana 47408, USA
Abstract
Summary
In this paper, we develop an eigenvector-assisted estimation framework for a collection of signal-plus-noise matrix models arising in high-dimensional statistics and many applications. The framework is built upon a novel asymptotically unbiased estimating equation using the leading eigenvectors of the data matrix. However, the estimator obtained by directly solving the estimating equation could be numerically unstable in practice and lacks robustness against model misspecification. We propose to use the quasi-posterior distribution by exponentiating a criterion function whose maximizer coincides with the estimating equation estimator. The proposed framework can incorporate heteroskedastic variance information, but does not require the complete specification of the sampling distribution and is also robust to the potential misspecification of the distribution of the noise matrix. Computationally, the quasi-posterior distribution can be obtained via a Markov chain Monte Carlo sampler, which exhibits superior numerical stability over some of the existing optimization-based estimators and is straightforward for uncertainty quantification. Under mild regularity conditions, we establish the large sample properties of the quasi-posterior distributions. In particular, the quasi-posterior credible sets have the correct frequentist nominal coverage probability provided that the criterion function is carefully selected. The validity and usefulness of the proposed framework are demonstrated through the analysis of synthetic datasets and the real-world ENZYMES network datasets.
Publisher
Oxford University Press (OUP)
Subject
Applied Mathematics,Statistics, Probability and Uncertainty,General Agricultural and Biological Sciences,Agricultural and Biological Sciences (miscellaneous),General Mathematics,Statistics and Probability