Minkovskian Gradient for Sparse Optimization

Information geometry is used to elucidate convex optimization problems under an L1 constraint. A convex function induces a Riemannian metric and two dually coupled affine connections on the manifold of parameters of interest, and a generalized Pythagorean theorem and a projection theorem hold in such a manifold. An extended LARS algorithm, applicable to both the under-determined and over-determined cases, is studied, and properties of its solution path are given. The algorithm is shown to be a Minkovskian gradient-descent method: it moves in the steepest-descent direction of the target function under the Minkovskian L1 norm. Two dually coupled affine coordinate systems are useful for analyzing the solution path.
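The steepest-descent direction of a smooth target function under the L1 norm concentrates the update on the coordinate whose gradient component has the largest magnitude. The following is a minimal illustrative sketch of that idea for a quadratic loss, not the paper's exact extended-LARS path algorithm; the function name and step-size parameter are assumptions made for the example.

```python
import numpy as np

def l1_steepest_descent(X, y, steps=500, lr=0.01):
    """Minimize 0.5 * ||X w - y||^2 by greedy coordinate updates.

    Under the L1 norm, the steepest-descent direction is the unit
    vector along the coordinate with the largest-magnitude gradient
    component (an illustrative sketch of the Minkovskian-gradient
    viewpoint, not the paper's solution-path algorithm).
    """
    n, d = X.shape
    w = np.zeros(d)
    for _ in range(steps):
        grad = X.T @ (X @ w - y)        # gradient of the quadratic loss
        j = np.argmax(np.abs(grad))     # coordinate of steepest L1 descent
        w[j] -= lr * np.sign(grad[j])   # unit-L1 step, scaled by lr
    return w
```

Because each step moves along a single coordinate axis, iterates tend to stay sparse, which is the connection to LARS-type solution paths sketched here.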