Optimization with Momentum: Dynamical, Control-Theoretic, and Symplectic Perspectives

We analyze the convergence rate of various momentum-based optimization algorithms from a dynamical systems point of view. Our analysis exploits fundamental topological properties, such as the continuous dependence of iterates on their initial conditions, to provide a simple characterization of convergence rates. In many cases, closed-form expressions are obtained that relate algorithm parameters to the convergence rate. The analysis encompasses discrete time and continuous time, as well as time-invariant and time-variant formulations, and is not limited to a convex or Euclidean setting. In addition, the article rigorously establishes why symplectic discretization schemes are important for momentum-based optimization algorithms, and provides a characterization of algorithms that exhibit accelerated convergence.
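To make the setting concrete, the following is a minimal sketch (not taken from the article) of two momentum-based iterations on a toy quadratic: Polyak's heavy-ball method and a semi-implicit, conformal-symplectic Euler discretization of the damped second-order dynamics ẍ + γẋ + ∇f(x) = 0. The step size h, damping γ, momentum β, and the ill-conditioned test objective are illustrative assumptions, not parameters or results from the paper.

```python
import numpy as np

def grad_f(x):
    """Gradient of the illustrative quadratic f(x) = 0.5 * x^T A x (assumed test problem)."""
    A = np.diag([1.0, 10.0])  # mildly ill-conditioned, chosen for illustration only
    return A @ x

def heavy_ball(x0, alpha=0.05, beta=0.9, iters=200):
    """Polyak's heavy-ball iteration: x_{k+1} = x_k - alpha * grad f(x_k) + beta * (x_k - x_{k-1})."""
    x_prev, x = x0.copy(), x0.copy()
    for _ in range(iters):
        x, x_prev = x - alpha * grad_f(x) + beta * (x - x_prev), x
    return x

def conformal_symplectic(x0, h=0.1, gamma=1.0, iters=200):
    """Semi-implicit Euler step for the damped dynamics  x' = p,  p' = -gamma * p - grad f(x).

    The momentum is damped and updated with the gradient first; the position is then
    advanced with the *new* momentum. This splitting mirrors the conformal symplectic
    integrators discussed in the symplectic-optimization literature.
    """
    x, p = x0.copy(), np.zeros_like(x0)
    for _ in range(iters):
        p = np.exp(-gamma * h) * p - h * grad_f(x)  # damped momentum update
        x = x + h * p                                # position update with the new momentum
    return x

if __name__ == "__main__":
    x0 = np.array([1.0, 1.0])
    print("heavy ball:          ", heavy_ball(x0))
    print("conformal symplectic:", conformal_symplectic(x0))
```

Both iterations drive the iterate toward the minimizer of the quadratic; the second one can be read as a structure-preserving discretization of the continuous-time momentum dynamics, which is the kind of connection the article analyzes.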
