From Nesterov's Estimate Sequence to Riemannian Acceleration

We propose the first global accelerated gradient method for Riemannian manifolds. To establish this result, we revisit Nesterov's estimate sequence technique and develop an alternative analysis of it that may be of independent interest. We then extend this analysis to the Riemannian setting, localizing the key difficulty caused by the non-Euclidean structure into a certain "metric distortion." We control this distortion with a novel geometric inequality, which allows us to propose and analyze a Riemannian counterpart to Nesterov's accelerated gradient method.
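
For orientation, here is a minimal sketch of the classical Euclidean accelerated gradient method that a Riemannian counterpart would generalize. It is the textbook constant-momentum variant for an L-smooth, mu-strongly convex objective, not the Riemannian algorithm proposed in the paper. The function name `nesterov_agd`, the diagonal test quadratic, and all parameter values are illustrative assumptions.

```python
def nesterov_agd(grad, x0, L, mu, num_iters):
    """Euclidean Nesterov acceleration with constant momentum (illustrative).

    Assumes the objective is L-smooth and mu-strongly convex.
    """
    sqrt_kappa = (L / mu) ** 0.5
    beta = (sqrt_kappa - 1.0) / (sqrt_kappa + 1.0)  # momentum coefficient
    x_prev = list(x0)
    x = list(x0)
    for _ in range(num_iters):
        # Extrapolate along the previous displacement.
        y = [xi + beta * (xi - pi) for xi, pi in zip(x, x_prev)]
        g = grad(y)
        # Gradient step with step size 1/L, taken from the extrapolated point.
        x_prev = x
        x = [yi - gi / L for yi, gi in zip(y, g)]
    return x


def gradient_descent(grad, x0, L, num_iters):
    """Plain gradient descent with step size 1/L, for comparison."""
    x = list(x0)
    for _ in range(num_iters):
        g = grad(x)
        x = [xi - gi / L for xi, gi in zip(x, g)]
    return x


# Hypothetical test problem: f(x) = 0.5 * sum(d_i * x_i^2), so mu = 1 and L = 100.
DIAG = [1.0, 100.0]


def f(x):
    return 0.5 * sum(d * xi * xi for d, xi in zip(DIAG, x))


def grad_f(x):
    return [d * xi for d, xi in zip(DIAG, x)]


x_agd = nesterov_agd(grad_f, [1.0, 1.0], L=100.0, mu=1.0, num_iters=200)
x_gd = gradient_descent(grad_f, [1.0, 1.0], L=100.0, num_iters=200)
print("accelerated:", f(x_agd), "gradient descent:", f(x_gd))
```

The extrapolation and gradient steps above rely on vector addition, which has no direct analogue on a curved manifold. The sketch is therefore only a Euclidean baseline for comparison.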
