Fast Second-Order Gradient Descent via O(n) Curvature Matrix-Vector Products
[1] Kenneth Levenberg. A Method for the Solution of Certain Non-Linear Problems in Least Squares , 1944 .
[2] D. Marquardt. An Algorithm for Least-Squares Estimation of Nonlinear Parameters , 1963 .
[3] P. J. Werbos,et al. Backpropagation: past and future , 1988, IEEE 1988 International Conference on Neural Networks.
[4] F. A. Seiler,et al. Numerical Recipes in C: The Art of Scientific Computing , 1989 .
[5] Heekuck Oh,et al. Neural Networks for Pattern Recognition , 1993, Adv. Comput..
[6] Barak A. Pearlmutter. Fast Exact Multiplication by the Hessian , 1994, Neural Computation.
[7] Terrence J. Sejnowski,et al. Tempering Backpropagation Networks: Not All Weights are Created Equal , 1995, NIPS.
[8] Manfred K. Warmuth,et al. Additive versus exponentiated gradient updates for linear prediction , 1995, STOC '95.
[9] Todd K. Leen,et al. Using Curvature Information for Fast Stochastic Search , 1996, NIPS.
[10] Mark Harmon. Multi-player residual advantage learning with general function , 1996 .
[11] Nicol N. Schraudolph. Online Learning with Adaptive Local Step Sizes , 1999 .
[12] Nicol N. Schraudolph,et al. Local Gain Adaptation in Stochastic Gradient Descent , 1999 .
[13] Nicol N. Schraudolph,et al. Online Independent Component Analysis with Local Learning Rate Adaptation , 1999, NIPS.
[14] M. Rattray,et al. Matrix Momentum for Practical Natural Gradient Learning , 1999 .
[15] Gavin C. Cawley,et al. On a Fast, Compact Approximation of the Exponential Function , 2000, Neural Computation.
[16] S. Amari. Natural Gradient Works Efficiently in Learning , 1998, Neural Computation.