A Generic Approach for Escaping Saddle Points
Sashank J. Reddi | Manzil Zaheer | Suvrit Sra | Barnabás Póczos | Francis R. Bach | Ruslan Salakhutdinov | Alexander J. Smola
[1] Harold J. Kushner, et al. Stochastic Approximation Methods for Constrained and Unconstrained Systems, 1978.
[2] L. Bottou. Stochastic Gradient Learning in Neural Networks, 1991.
[3] Barak A. Pearlmutter. Fast Exact Multiplication by the Hessian, 1994, Neural Computation.
[4] Tamer Basar, et al. Analysis of Recursive Stochastic Algorithms, 2001.
[5] Yurii Nesterov, et al. Introductory Lectures on Convex Optimization - A Basic Course, 2014, Applied Optimization.
[6] Geoffrey E. Hinton, et al. Reducing the Dimensionality of Data with Neural Networks, 2006, Science.
[7] Yurii Nesterov, et al. Cubic Regularization of Newton Method and Its Global Performance, 2006, Math. Program.
[8] H. Robbins. A Stochastic Approximation Method, 1951.
[9] James Martens, et al. Deep Learning via Hessian-Free Optimization, 2010, ICML.
[10] Daniel Povey, et al. Krylov Subspace Descent for Deep Learning, 2011, AISTATS.
[11] Shai Shalev-Shwartz, et al. Stochastic Dual Coordinate Ascent Methods for Regularized Loss, 2012, J. Mach. Learn. Res.
[12] Tong Zhang, et al. Accelerating Stochastic Gradient Descent Using Predictive Variance Reduction, 2013, NIPS.
[13] Saeed Ghadimi, et al. Stochastic First- and Zeroth-Order Methods for Nonconvex Stochastic Programming, 2013, SIAM J. Optim.
[14] Geoffrey E. Hinton, et al. On the Importance of Initialization and Momentum in Deep Learning, 2013, ICML.
[15] Justin Domke, et al. Finito: A Faster, Permutable Incremental Gradient Method for Big Data Problems, 2014, ICML.
[16] Francis Bach, et al. SAGA: A Fast Incremental Gradient Method with Support for Non-Strongly Convex Composite Objectives, 2014, NIPS.
[17] Yann LeCun, et al. The Loss Surface of Multilayer Networks, 2014, ArXiv.
[18] Surya Ganguli, et al. Identifying and Attacking the Saddle Point Problem in High-Dimensional Non-Convex Optimization, 2014, NIPS.
[19] Yoshua Bengio, et al. Equilibrated Adaptive Learning Rates for Non-Convex Optimization, 2015, NIPS.
[20] Léon Bottou, et al. A Lower Bound for the Optimization of Finite Sums, 2014, ICML.
[21] Jimmy Ba, et al. Adam: A Method for Stochastic Optimization, 2014, ICLR.
[22] Furong Huang, et al. Escaping from Saddle Points - Online Stochastic Gradient for Tensor Decomposition, 2015, COLT.
[23] Yijun Huang, et al. Asynchronous Parallel Stochastic Gradient for Nonconvex Optimization, 2015, NIPS.
[24] Roger B. Grosse, et al. Optimizing Neural Networks with Kronecker-Factored Approximate Curvature, 2015, ICML.
[25] Alexander J. Smola, et al. On Variance Reduction in Stochastic Gradient Descent and Its Asynchronous Variants, 2015, NIPS.
[26] Jie Liu, et al. Mini-Batch Semi-Stochastic Gradient Descent in the Proximal Setting, 2014, IEEE Journal of Selected Topics in Signal Processing.
[27] Alexander J. Smola, et al. Fast Stochastic Methods for Nonsmooth Nonconvex Optimization, 2016, ArXiv.
[28] Alexander J. Smola, et al. Fast Incremental Method for Nonconvex Optimization, 2016, ArXiv.
[29] Yair Carmon, et al. Accelerated Methods for Non-Convex Optimization, 2016, SIAM J. Optim.
[30] Kfir Y. Levy, et al. The Power of Normalization: Faster Evasion of Saddle Points, 2016, ArXiv.
[31] Alexander J. Smola, et al. Stochastic Variance Reduction for Nonconvex Optimization, 2016, ICML.
[32] Tengyu Ma, et al. Finding Approximate Local Minima for Nonconvex Optimization in Linear Time, 2016, ArXiv.
[33] Naman Agarwal, et al. Second Order Stochastic Optimization in Linear Time, 2016, ArXiv.
[34] Mark W. Schmidt, et al. Minimizing Finite Sums with the Stochastic Average Gradient, 2013, Mathematical Programming.
[35] Michael I. Jordan, et al. How to Escape Saddle Points Efficiently, 2017, ICML.
[36] Yi Zhou, et al. An Optimal Randomized Incremental Gradient Method, 2015, Mathematical Programming.
[37] Katya Scheinberg, et al. Global Convergence Rate Analysis of Unconstrained Optimization Methods Based on Probabilistic Models, 2015, Mathematical Programming.