First-order methods almost always avoid strict saddle points
Michael I. Jordan | Max Simchowitz | Benjamin Recht | Georgios Piliouras | Jason D. Lee | Ioannis Panageas
[1] Philip E. Gill, et al. Newton-type methods for unconstrained and linearly constrained optimization, 1974, Math. Program.
[2] Danny C. Sorensen, et al. On the use of directions of negative curvature in a modified newton method, 1979, Math. Program.
[3] E. Akin, et al. Dynamics of games and genes: Discrete versus continuous time, 1983.
[4] M. Shub. Global Stability of Dynamical Systems, 1986.
[5] Katta G. Murty, et al. Some NP-complete problems in quadratic and nonlinear programming, 1987, Math. Program.
[6] R. Pemantle, et al. Nonconvergence to Unstable Points in Urn Models and Stochastic Approximations, 1990.
[7] Nicholas I. M. Gould, et al. Trust Region Methods, 2000, MOS-SIAM Series on Optimization.
[8] P. Mikusinski, et al. An Introduction to Multivariable Analysis from Vector to Manifold, 2001.
[9] Yurii Nesterov, et al. Introductory Lectures on Convex Optimization - A Basic Course, 2004, Applied Optimization.
[10] R. Steele, et al. Optimization, 2005, Encyclopedia of Biometrics.
[11] Robert E. Mahony, et al. Convergence of the Iterates of Descent Methods for Analytic Cost Functions, 2005, SIAM J. Optim.
[12] Yurii Nesterov, et al. Cubic regularization of Newton method and its global performance, 2006, Math. Program.
[13] Robert E. Mahony, et al. Optimization Algorithms on Matrix Manifolds, 2007.
[14] R. Adler, et al. Random Fields and Geometry, 2007.
[15] Adrian S. Lewis, et al. Alternating Projections on Manifolds, 2008, Math. Oper. Res.
[16] J. Bolte, et al. Characterizations of Lojasiewicz inequalities: Subgradient flows, talweg, convexity, 2009.
[17] Éva Tardos, et al. Multiplicative updates outperform generic no-regret learning in congestion games: extended abstract, 2009, STOC '09.
[18] Andrea Montanari, et al. Matrix completion from a few entries, 2009, 2009 IEEE International Symposium on Information Theory.
[19] Hédy Attouch, et al. Proximal Alternating Minimization and Projection Methods for Nonconvex Problems: An Approach Based on the Kurdyka-Lojasiewicz Inequality, 2008, Math. Oper. Res.
[20] Antonio Auffinger, et al. Random Matrices and Complexity of Spin Glasses, 2010, 1003.1129.
[21] Jérôme Malick, et al. Projection-like Retractions on Matrix Manifolds, 2012, SIAM J. Optim.
[22] Benar Fux Svaiter, et al. Convergence of descent methods for semi-algebraic and tame problems: proximal algorithms, forward–backward splitting, and regularized Gauss–Seidel methods, 2013, Math. Program.
[23] Robert E. Mahony, et al. An Extrinsic Look at the Riemannian Hessian, 2013, GSI.
[24] Surya Ganguli, et al. On the saddle point problem for non-convex optimization, 2014, ArXiv.
[25] Surya Ganguli, et al. Identifying and attacking the saddle point problem in high-dimensional non-convex optimization, 2014, NIPS.
[26] Peter Richtárik, et al. Iteration complexity of randomized block-coordinate descent methods for minimizing a composite function, 2011, Mathematical Programming.
[27] Xi Chen, et al. Spectral Methods Meet EM: A Provably Optimal Algorithm for Crowdsourcing, 2014, J. Mach. Learn. Res.
[28] Marc Teboulle, et al. Proximal alternating linearized minimization for nonconvex and nonsmooth problems, 2013, Mathematical Programming.
[29] Sanjeev Arora, et al. Simple, Efficient, and Neural Algorithms for Sparse Coding, 2015, COLT.
[30] Xiaodong Li, et al. Optimal Rates of Convergence for Noisy Sparse Phase Retrieval via Thresholded Wirtinger Flow, 2015, ArXiv.
[31] Furong Huang, et al. Escaping From Saddle Points - Online Stochastic Gradient for Tensor Decomposition, 2015, COLT.
[32] Yann LeCun, et al. The Loss Surfaces of Multilayer Networks, 2014, AISTATS.
[33] Xiaodong Li, et al. Phase Retrieval via Wirtinger Flow: Theory and Algorithms, 2014, IEEE Transactions on Information Theory.
[34] John Wright, et al. A Geometric Analysis of Phase Retrieval, 2016, 2016 IEEE International Symposium on Information Theory (ISIT).
[35] Mikhail Belkin, et al. Basis Learning as an Algorithmic Primitive, 2014, COLT.
[36] John Wright, et al. A Geometric Analysis of Phase Retrieval, 2016, International Symposium on Information Theory.
[37] Nathan Srebro, et al. Global Optimality of Local Search for Low Rank Matrix Recovery, 2016, NIPS.
[38] Michael I. Jordan, et al. Gradient Descent Only Converges to Minimizers, 2016, COLT.
[39] Tengyu Ma, et al. Matrix Completion has No Spurious Local Minimum, 2016, NIPS.
[40] Georgios Piliouras, et al. Gradient Descent Only Converges to Minimizers: Non-Isolated Critical Points and Invariant Regions, 2016, ITCS.
[41] Yi Zheng, et al. No Spurious Local Minima in Nonconvex Low Rank Problems: A Unified Geometric Analysis, 2017, ICML.
[42] John Wright, et al. Complete Dictionary Recovery Over the Sphere I: Overview and the Geometric Picture, 2015, IEEE Trans. Inf. Theory.
[43] Amir Beck, et al. First-Order Methods in Optimization, 2017.
[44] John Wright, et al. Complete Dictionary Recovery Over the Sphere II: Recovery by Riemannian Trust-Region Method, 2015, IEEE Transactions on Information Theory.
[45] Amir Globerson, et al. Globally Optimal Gradient Descent for a ConvNet with Gaussian Inputs, 2017, ICML.
[46] Marc Teboulle, et al. First Order Methods beyond Convexity and Lipschitz Gradient Continuity with Applications to Quadratic Inverse Problems, 2017, SIAM J. Optim.
[47] Adel Javanmard, et al. Theoretical Insights Into the Optimization Landscape of Over-Parameterized Shallow Neural Networks, 2017, IEEE Transactions on Information Theory.