Error bounds for Bregman denoising and structured natural parameter estimation

We analyze an estimator based on the Bregman divergence for the recovery of structured models from additive noise. The estimator can be viewed as a regularized maximum likelihood estimator for an exponential family whose natural parameter is assumed to be structured. For all such Bregman denoising estimators, we provide an error bound in a naturally associated error measure. This bound makes it possible to analyze a wide range of estimators, such as those arising in proximal denoising and inverse covariance matrix estimation, in a unified manner. In the case of proximal denoising, we exactly recover the existing tight normalized mean squared error bounds. For sparse precision matrix estimation, our bounds provide optimal scaling with interpretable constants in terms of the associated error measure.
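For concreteness, a minimal sketch of the setup follows, in generic notation: the potential $\phi$, regularizer $\mathcal{R}$, and weight $\lambda$ below are illustrative placeholders rather than the paper's exact symbols. For a strictly convex, differentiable potential $\phi$, the Bregman divergence is
$$ D_\phi(x, y) \,=\, \phi(x) - \phi(y) - \langle \nabla \phi(y),\, x - y \rangle, $$
and a Bregman denoising estimator takes the form
$$ \hat{\theta} \,\in\, \operatorname*{arg\,min}_{\theta} \; D_\phi(\theta, \bar{y}) + \lambda\, \mathcal{R}(\theta), $$
where $\bar{y}$ is determined by the observation and $\mathcal{R}$ promotes the assumed structure. With $\phi = \tfrac{1}{2}\|\cdot\|_2^2$ this reduces to proximal denoising, $\hat{\theta} = \operatorname*{arg\,min}_{\theta} \tfrac{1}{2}\|y - \theta\|_2^2 + \lambda\, \mathcal{R}(\theta)$; with $\phi$ the log-partition function of an exponential family, the objective is, up to constants, a regularized negative log-likelihood in the natural parameter, which covers $\ell_1$-regularized sparse precision matrix estimation as a special case.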
