Abstract
Background: Low-rank approximation is a useful approach for interpreting the features of a correlation matrix; however, a low-rank approximation may yield estimates far from zero even when the corresponding original value was zero. In such cases, the results can be misinterpreted.

Methods: To overcome this problem, we propose a new approach to estimate a sparse low-rank correlation matrix based on threshold values combined with cross-validation. In the proposed approach, the MM algorithm is used to estimate the sparse low-rank correlation matrix, and a grid search is performed to select the threshold values related to sparse estimation.

Results: In numerical simulations, the false positive rate (FPR) and average relative error of the proposed method were superior to those of the tandem approach. In an application to microarray gene expression data, the FPRs of the proposed approach with d = 2, 3, and 5 were 0.128, 0.139, and 0.197, respectively, while the FPR of the tandem approach was 0.285.

Conclusions: We propose a novel approach to estimate a sparse low-rank correlation matrix. The advantage of the proposed method is that it provides results that are easy to interpret and that avoid misunderstandings. We demonstrated the superiority of the proposed method through both numerical simulations and real examples.
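The false positives mentioned in the Background can be illustrated with a minimal sketch. It is not the authors' MM algorithm: the low-rank fit below is a plain truncated eigendecomposition with rows rescaled to unit length, and the function names (`hard_threshold`, `low_rank_correlation`, `false_positive_rate`), the tolerance `tol`, and the FPR definition (share of true off-diagonal zeros that are estimated as nonzero) are assumptions made for illustration.

```python
import numpy as np

def hard_threshold(R, h):
    """Set off-diagonal entries with |r_ij| < h to zero (hypothetical helper)."""
    S = np.where(np.abs(R) >= h, R, 0.0)
    np.fill_diagonal(S, 1.0)
    return S

def low_rank_correlation(R, d):
    """Rank-d fit via truncated eigendecomposition (a stand-in, not the MM algorithm).

    Rows of the loading matrix are rescaled to unit length so that the
    fitted matrix has ones on its diagonal.
    """
    vals, vecs = np.linalg.eigh(R)            # eigenvalues in ascending order
    top = np.argsort(vals)[::-1][:d]          # indices of the d largest
    W = vecs[:, top] * np.sqrt(np.clip(vals[top], 0.0, None))
    norms = np.linalg.norm(W, axis=1, keepdims=True)
    W = W / np.where(norms > 0, norms, 1.0)   # avoid division by zero
    return W @ W.T

def false_positive_rate(R_true, R_est, tol=1e-8):
    """Share of true off-diagonal zeros that are estimated as nonzero (assumed definition)."""
    off_diag = ~np.eye(R_true.shape[0], dtype=bool)
    true_zero = (np.abs(R_true) <= tol) & off_diag
    if true_zero.sum() == 0:
        return float("nan")
    est_nonzero = np.abs(R_est) > tol
    return float((true_zero & est_nonzero).sum() / true_zero.sum())

# Variables 1 and 3 are uncorrelated, but both correlate with variable 2.
R_true = np.array([[1.0, 0.5, 0.0],
                   [0.5, 1.0, 0.5],
                   [0.0, 0.5, 1.0]])

R_hat = low_rank_correlation(R_true, d=1)
fpr = false_positive_rate(R_true, R_hat)
print(np.round(R_hat, 3))
print("FPR of the rank-1 fit:", fpr)
```

In this toy case the rank-1 fit assigns a nonzero value to the pair whose true correlation is zero, which is the kind of misleading estimate a sparse low-rank approach is meant to prevent.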
Publisher
Cold Spring Harbor Laboratory