Towards an optimal stochastic alternating direction method of multipliers

We study regularized stochastic convex optimization subject to linear equality constraints. This class of problems was also studied recently by Ouyang et al. (2013) and Suzuki (2013), both of whom introduced similar stochastic alternating direction method of multipliers (SADMM) algorithms. However, the analyses in both papers led to suboptimal convergence rates. This paper presents two new SADMM methods: (i) the first attains the minimax optimal rate of O(1/k) for nonsmooth strongly convex stochastic problems, while (ii) the second progresses towards an optimal rate by exhibiting an O(1/k^2) rate for the smooth part. We present several experiments with our new methods; the results indicate improved performance over competing ADMM methods.
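To make the problem class concrete, the following is a minimal sketch of a generic stochastic linearized ADMM iteration. It is not either of the two methods proposed in this paper. It assumes an illustrative lasso-type instance, minimizing E[0.5 (a^T x - b)^2] + lam * ||z||_1 subject to x - z = 0, with a scaled dual variable u. The function names, the step-size schedule eta0 / sqrt(k + 1), and all parameter values are assumptions chosen for illustration only.

```python
import numpy as np


def soft_threshold(v, tau):
    """Proximal operator of tau * ||.||_1 (elementwise shrinkage)."""
    return np.sign(v) * np.maximum(np.abs(v) - tau, 0.0)


def stochastic_admm_lasso(sample, dim, lam=0.01, beta=1.0, eta0=0.05, iters=5000):
    """Generic stochastic linearized ADMM for the constraint x - z = 0.

    `sample` is a hypothetical callable returning one pair (a, b) per call.
    """
    x = np.zeros(dim)  # variable of the stochastic loss
    z = np.zeros(dim)  # variable of the regularizer
    u = np.zeros(dim)  # scaled dual variable for x - z = 0
    for k in range(iters):
        a, b = sample()
        grad = (a @ x - b) * a           # stochastic gradient of the loss at x
        eta = eta0 / np.sqrt(k + 1)      # assumed decaying step size
        # x-update: linearized loss + augmented term + proximal term in x
        x = (x / eta + beta * (z - u) - grad) / (1.0 / eta + beta)
        # z-update: exact minimization via soft-thresholding
        z = soft_threshold(x + u, lam / beta)
        # dual update on the constraint residual
        u = u + x - z
    return x, z


# Synthetic data: a sparse ground truth observed through noisy linear samples.
rng = np.random.default_rng(0)
x_true = np.zeros(10)
x_true[[0, 3, 7]] = [1.0, -2.0, 0.5]


def sample():
    a = rng.standard_normal(10)
    b = a @ x_true + 0.01 * rng.standard_normal()
    return a, b


x_hat, z_hat = stochastic_admm_lasso(sample, dim=10)
```

Each iteration touches a single sample, which is what distinguishes the stochastic variant from batch ADMM. The step-size schedule used here is a common default for general convex losses and is not tuned to achieve the rates discussed above.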

[1]  H. H. Rachford,et al.  On the numerical solution of heat conduction problems in two and three space variables , 1956 .

[2]  R. Rockafellar Monotone Operators and the Proximal Point Algorithm , 1976 .

[3]  Dimitri P. Bertsekas,et al.  On the Douglas—Rachford splitting method and the proximal point algorithm for maximal monotone operators , 1992, Math. Program..

[4]  Yurii Nesterov,et al.  Introductory Lectures on Convex Optimization - A Basic Course , 2014, Applied Optimization.

[5]  Y. Nesterov Gradient methods for minimizing composite objective function , 2007 .

[6]  Alexandre d'Aspremont,et al.  Model Selection Through Sparse Maximum Likelihood Estimation for Multivariate Gaussian or Binary Data , 2022 .

[7]  Lin Xiao,et al.  Dual Averaging Methods for Regularized Stochastic Learning and Online Optimization , 2009, J. Mach. Learn. Res..

[8]  Marc Teboulle,et al.  A Fast Iterative Shrinkage-Thresholding Algorithm for Linear Inverse Problems , 2009, SIAM J. Imaging Sci..

[9]  Alexander Shapiro,et al.  Stochastic Approximation approach to Stochastic Programming , 2013 .

[10]  Yoram Singer,et al.  Efficient Online and Batch Learning Using Forward Backward Splitting , 2009, J. Mach. Learn. Res..

[11]  Antonin Chambolle,et al.  A First-Order Primal-Dual Algorithm for Convex Problems with Applications to Imaging , 2011, Journal of Mathematical Imaging and Vision.

[12]  Stephen P. Boyd,et al.  Distributed Optimization and Statistical Learning via the Alternating Direction Method of Multipliers , 2011, Found. Trends Mach. Learn..

[13]  Xi Chen,et al.  Optimal Regularized Dual Averaging Methods for Stochastic Optimization , 2012, NIPS.

[14]  Saeed Ghadimi,et al.  Optimal Stochastic Approximation Algorithms for Strongly Convex Stochastic Composite Optimization I: A Generic Algorithmic Framework , 2012, SIAM J. Optim..

[15]  Donald Goldfarb,et al.  A Variable-Splitting Augmented Lagrangian Framework , 2011 .

[16]  Mark W. Schmidt,et al.  A simpler approach to obtaining an O(1/t) convergence rate for the projected stochastic subgradient method , 2012, ArXiv.

[17]  Tim Kraska,et al.  MLbase: A Distributed Machine-learning System , 2013, CIDR.

[18]  Alexander G. Gray,et al.  Stochastic Alternating Direction Method of Multipliers , 2013, ICML.

[19]  Ohad Shamir,et al.  Stochastic Gradient Descent for Non-smooth Optimization: Convergence Results and Optimal Averaging Schemes , 2012, ICML.

[20]  Andrew Cotter,et al.  Stochastic Optimization for Machine Learning , 2013, ArXiv.

[21]  Raymond H. Chan,et al.  Constrained Total Variation Deblurring Models and Fast Algorithms Based on Alternating Direction Method of Multipliers , 2013, SIAM J. Imaging Sci..

[22]  Taiji Suzuki Dual Averaging and Proximal Gradient Descent for Online Alternating Direction Multiplier Method , 2013 .

[23]  Shiqian Ma,et al.  Fast alternating linearization methods for minimizing the sum of two convex functions , 2009, Math. Program..

[24]  Stephen P. Boyd,et al.  Proximal Algorithms , 2013, Found. Trends Optim..

[25]  Yurii Nesterov,et al.  First-order methods of smooth convex optimization with inexact oracle , 2013, Mathematical Programming.

[26]  Bingsheng He,et al.  On non-ergodic convergence rate of Douglas–Rachford alternating direction method of multipliers , 2014, Numerische Mathematik.