A Direct Formulation for Sparse PCA Using Semidefinite Programming

Given a covariance matrix, we consider the problem of maximizing the variance explained by a particular linear combination of the input variables while constraining the number of nonzero coefficients in this combination. This problem arises in the decomposition of a covariance matrix into sparse factors, also known as sparse principal component analysis (PCA), and has wide applications ranging from biology to finance. We use a modification of the classical variational representation of the largest eigenvalue of a symmetric matrix, in which cardinality is constrained, and derive a semidefinite programming-based relaxation for our problem. We also discuss Nesterov's smooth minimization technique applied to the semidefinite program that arises from this relaxation. The resulting method has complexity $O(n^4 \sqrt{\log(n)}/\epsilon)$, where $n$ is the size of the underlying covariance matrix and $\epsilon$ is the desired absolute accuracy on the optimal value of the problem.
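
To make the cardinality-constrained problem concrete, the following is a minimal sketch, not the semidefinite relaxation described above. It assumes the problem is written as maximizing $x^T A x$ subject to $\|x\|_2 = 1$ and at most $k$ nonzero entries in $x$, where the symbols $A$, $x$, and $k$ and the function name `sparse_pca_bruteforce` are introduced here for illustration only. It relies on the fact that, for a fixed support, the best unit vector is the leading eigenvector of the corresponding principal submatrix, so the exact optimum can be found by enumerating supports on very small instances.

```python
from itertools import combinations

import numpy as np


def sparse_pca_bruteforce(A, k):
    """Exhaustively solve max x^T A x  s.t.  ||x||_2 = 1, card(x) <= k.

    Illustrative baseline only: it enumerates every support of size k,
    so it is usable only for very small n.
    """
    n = A.shape[0]
    best_val, best_x = -np.inf, None
    # By eigenvalue interlacing, enlarging a support never lowers the
    # largest eigenvalue, so supports of size exactly k are sufficient.
    for support in combinations(range(n), k):
        idx = np.array(support)
        sub = A[np.ix_(idx, idx)]      # principal submatrix on this support
        w, V = np.linalg.eigh(sub)     # eigenvalues in ascending order
        if w[-1] > best_val:
            best_val = w[-1]
            best_x = np.zeros(n)
            best_x[idx] = V[:, -1]     # leading eigenvector, zero elsewhere
    return best_val, best_x


# Small demonstration on a synthetic sample covariance matrix.
rng = np.random.default_rng(0)
data = rng.standard_normal((50, 6))
cov = data.T @ data / data.shape[0]
value, loading = sparse_pca_bruteforce(cov, 2)
print("variance explained with 2 nonzero coefficients:", value)
print("nonzero coefficients at indices:", np.flatnonzero(loading))
```

The number of supports grows combinatorially with $n$ and $k$, which is why this sketch serves only as a reference point on toy problems and why a tractable relaxation is of interest.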
