Modular Proximal Optimization for Multidimensional Total-Variation Regularization

We study \emph{TV regularization}, a widely used technique for eliciting structured sparsity. In particular, we propose efficient algorithms for computing prox-operators for $\ell_p$-norm TV. The most important among these is $\ell_1$-norm TV, for whose prox-operator we present a new geometric analysis that unveils a hitherto unknown connection to taut-string methods. This connection turns out to be remarkably useful, as it shows how our geometry-guided implementation results in efficient weighted and unweighted 1D-TV solvers, surpassing state-of-the-art methods. Our 1D-TV solvers provide the backbone for building more complex (two- or higher-dimensional) TV solvers within a modular proximal optimization approach. We review the literature for an array of methods exploiting this strategy, and illustrate the benefits of our modular design through an extensive suite of experiments on (i) image denoising, (ii) image deconvolution, (iii) four variants of fused-lasso, and (iv) video denoising. To underscore our claims and permit easy reproducibility, we provide all the reviewed solvers and our new TV solvers in an easy-to-use, multi-threaded C++, Matlab and Python library.
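
To make the modular idea concrete, the following is a minimal sketch and not the solvers proposed in the paper. It assumes the standard unweighted $\ell_1$ formulation $\mathrm{prox}(y) = \arg\min_x \tfrac{1}{2}\|x - y\|_2^2 + \lambda \sum_i |x_{i+1} - x_i|$. The 1D solver here is a plain projected-gradient method on the dual problem, swapped in for the taut-string approach, and the 2D anisotropic solver combines row-wise and column-wise 1D calls through a Dykstra-like proximal splitting. The function names `tv1d_prox` and `tv2d_prox` and the iteration counts are illustrative assumptions, not part of the accompanying library.

```python
import numpy as np


def _primal_from_dual(y, u):
    # x = y - D^T u, where (D x)_i = x[i+1] - x[i] along the last axis.
    x = y.copy()
    x[..., :-1] += u
    x[..., 1:] -= u
    return x


def tv1d_prox(y, lam, n_iter=2000):
    """Approximate 1D-TV prox along the last axis (hypothetical helper).

    Runs projected gradient descent on the dual problem
        min_u 0.5 * ||y - D^T u||^2   subject to   |u_i| <= lam,
    with step 1/4, which is valid because ||D D^T|| < 4.
    """
    y = np.asarray(y, dtype=float)
    if y.shape[-1] < 2 or lam <= 0:
        return y.copy()
    u = np.zeros(y.shape[:-1] + (y.shape[-1] - 1,))
    for _ in range(n_iter):
        x = _primal_from_dual(y, u)
        # The dual gradient is -D x, so a descent step adds 0.25 * D x.
        u = np.clip(u + 0.25 * np.diff(x, axis=-1), -lam, lam)
    return _primal_from_dual(y, u)


def tv2d_prox(Y, lam, n_outer=50, n_inner=500):
    """Approximate anisotropic 2D-TV prox via Dykstra-like splitting.

    Alternates the 1D prox over columns and over rows, carrying the
    correction terms p and q between the two steps.
    """
    Y = np.asarray(Y, dtype=float)
    x = Y.copy()
    p = np.zeros_like(Y)
    q = np.zeros_like(Y)
    for _ in range(n_outer):
        z = tv1d_prox((x + p).T, lam, n_inner).T  # TV along columns
        p = x + p - z
        x = tv1d_prox(z + q, lam, n_inner)        # TV along rows
        q = z + q - x
    return x


if __name__ == "__main__":
    print(tv1d_prox([0.0, 2.0], 0.5))
    print(tv2d_prox(np.array([[0.0, 2.0], [0.0, 2.0]]), 0.5))
```

The point of the sketch is the separation of concerns: `tv2d_prox` treats the 1D solver as a black box, so a faster 1D routine can be dropped in without touching the outer splitting loop.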
