Precise high-dimensional error analysis of regularized M-estimators

A general approach for estimating an unknown signal x<sub>0</sub> ∈ ℝ<sup>n</sup> from noisy linear measurements y = Ax<sub>0</sub> + z ∈ ℝ<sup>m</sup> is to solve a so-called regularized M-estimator: x̂ := arg min<sub>x</sub> ℒ(y − Ax) + λf(x). Here, ℒ is a convex loss function, f is a convex (typically non-smooth) regularizer, and λ > 0 is a regularization parameter. We analyze the squared-error performance ∥x̂ − x<sub>0</sub>∥<sub>2</sub><sup>2</sup> of such estimators in the high-dimensional proportional regime where m, n → ∞ and m/n → δ. We let the design matrix A have i.i.d. Gaussian entries and impose mild regularity conditions on the loss function, on the regularizer, and on the distributions of the noise and of the unknown signal. Under this generic setting, we show that the squared error converges in probability to a nontrivial limit that is computed by solving four nonlinear equations in four scalar unknowns. We identify a new summary parameter, termed the expected Moreau envelope, which determines how the choice of the loss function and of the regularizer affects the error performance. The result opens the way to answering optimality questions regarding the choice of the loss function, the regularizer, the penalty parameter, etc.
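To make the setup concrete, the following Python sketch simulates one instance of the estimator and evaluates a Moreau envelope. It is illustrative only and rests on assumptions not stated in the abstract: the loss is taken to be ℒ(v) = ½∥v∥<sub>2</sub><sup>2</sup> and the regularizer f(x) = ∥x∥<sub>1</sub> (the LASSO), the entries of A are scaled as N(0, 1/n), the signal is sparse Gaussian, and the solver is plain proximal gradient descent (ISTA). The Moreau envelope uses the standard definition e<sub>f</sub>(x; τ) = min<sub>v</sub> { (x − v)<sup>2</sup>/(2τ) + f(v) }; the paper's exact normalization of the expected Moreau envelope may differ. All function names and parameter values are hypothetical, and the sketch does not compute the four-equation limit itself.

```python
import numpy as np

def soft_threshold(v, t):
    """Proximal operator of t*|.|, applied entrywise."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def moreau_envelope_abs(x, tau):
    """Moreau envelope of f(v) = |v|: min_v (x - v)^2 / (2*tau) + |v|.

    The minimizer is the soft-thresholded point, so the envelope is the
    Huber function. Standard definition; assumed, not taken from the paper.
    """
    v = soft_threshold(x, tau)
    return (x - v) ** 2 / (2.0 * tau) + np.abs(v)

def lasso_objective(A, y, x, lam):
    """0.5 * ||y - A x||_2^2 + lam * ||x||_1."""
    return 0.5 * np.sum((y - A @ x) ** 2) + lam * np.sum(np.abs(x))

def lasso_ista(A, y, lam, n_iter=2000):
    """Minimize the LASSO objective by proximal gradient descent (ISTA)."""
    # Step size 1/L, where L is the largest eigenvalue of A^T A.
    step = 1.0 / np.linalg.norm(A, 2) ** 2
    x = np.zeros(A.shape[1])
    for _ in range(n_iter):
        grad = A.T @ (A @ x - y)
        x = soft_threshold(x - step * grad, step * lam)
    return x

# Illustrative problem instance (all values are assumptions).
rng = np.random.default_rng(0)
n, delta = 200, 0.7
m = int(delta * n)                                   # m/n = delta
x0 = rng.standard_normal(n) * (rng.random(n) < 0.1)  # sparse signal
A = rng.standard_normal((m, n)) / np.sqrt(n)         # iid Gaussian design
z = 0.1 * rng.standard_normal(m)                     # noise
y = A @ x0 + z

lam = 0.05
x_hat = lasso_ista(A, y, lam)
err = np.sum((x_hat - x0) ** 2) / n                  # per-coordinate squared error
print("per-coordinate squared error:", err)

# Monte Carlo estimate of an expected Moreau envelope E[e_f(G; tau)], G ~ N(0, 1).
g = rng.standard_normal(100_000)
print("expected Moreau envelope (tau = 1):", moreau_envelope_abs(g, 1.0).mean())
```

Repeating the simulation for growing n at fixed δ would be one way to observe the error settling toward a deterministic value, which is the kind of limit the abstract describes.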
