Almeida et al. have recently proposed online algorithms for local step size adaptation in nonlinear systems trained by gradient descent. Here we develop an alternative to their approach by extending Sutton’s work on linear systems to the general, nonlinear case. The resulting algorithms are computationally little more expensive than other acceleration techniques, do not assume statistical independence between successive training patterns, and do not require an arbitrary smoothing parameter. In our benchmark experiments, they consistently outperform other acceleration methods as well as stochastic gradient descent with fixed learning rate and momentum.
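The abstract describes adapting each weight's step size online by gradient descent, generalizing Sutton's work on linear systems. For concreteness, below is a minimal sketch of Sutton's IDBD update on a toy linear least-squares problem, which is the linear-case starting point the paper extends; the toy setup, variable names, and hyperparameter values are illustrative assumptions, and this listing does not reproduce the paper's nonlinear algorithm.

```python
# Minimal sketch of per-parameter ("local") gain adaptation in the spirit of
# Sutton's IDBD, on a linear least-squares unit. Illustrative only: the toy
# problem, names (beta, h, meta_rate), and constants are assumptions, not the
# paper's nonlinear method.
import numpy as np

rng = np.random.default_rng(0)
n = 5
w_true = rng.normal(size=n)          # teacher weights to recover
w = np.zeros(n)                      # learner weights
beta = np.full(n, np.log(0.05))      # log step sizes: alpha_i = exp(beta_i)
h = np.zeros(n)                      # trace of recent weight updates
meta_rate = 0.01                     # meta learning rate (theta in IDBD)

for t in range(5000):
    x = rng.normal(size=n)           # input pattern
    delta = w_true @ x - w @ x       # prediction error on this pattern

    # Meta update: raise beta_i when the current gradient agrees with the
    # trace of recent updates, lower it when they disagree.
    beta += meta_rate * delta * x * h
    alpha = np.exp(beta)             # per-parameter step sizes, always > 0

    # Weight update with local gains, then refresh the update trace.
    w += alpha * delta * x
    h = h * np.maximum(0.0, 1.0 - alpha * x * x) + alpha * delta * x

print("final max weight error:", np.abs(w - w_true).max())
```

Because the step sizes are exponentiated, they remain positive by construction, and each one grows only while successive gradient components for that weight keep agreeing in sign; this per-parameter mechanism is what the work described above carries over to the nonlinear case.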
Frank Fallside et al. An adaptive training algorithm for back propagation networks.
Nicol N. Schraudolph. Online Local Gain Adaptation for Multi-Layer Perceptrons.
J. van Leeuwen et al. Neural Networks: Tricks of the Trade. Lecture Notes in Computer Science.
Barak A. Pearlmutter et al. Automatic Learning Rate Maximization in Large Adaptive Machines.
Sharad Singhal et al. Training Multilayer Perceptrons with the Extended Kalman Algorithm.
Heinrich Braun et al. A direct adaptive method for faster backpropagation learning: the RPROP algorithm. IEEE International Conference on Neural Networks.
Nicol N. Schraudolph et al. A Fast, Compact Approximation of the Exponential Function.
Manfred K. Warmuth et al. Additive versus exponentiated gradient updates for linear prediction.
Tom Tollenaere et al. SuperSAB: Fast adaptive back propagation with good scaling properties.
Barak A. Pearlmutter. Fast Exact Multiplication by the Hessian.
Terrence J. Sejnowski et al. Tempering Backpropagation Networks: Not All Weights are Created Equal.
Luís B. Almeida et al. Speeding up Backpropagation.
Richard S. Sutton et al. Adapting Bias by Gradient Descent: An Incremental Version of Delta-Bar-Delta.
Thibault Langlois et al. Parameter adaptation in stochastic optimization.
Robert A. Jacobs et al. Increased rates of convergence through learning rate adaptation.
Ralph Neuneier et al. How to Train Neural Networks. Neural Networks: Tricks of the Trade.
William H. Press et al. Numerical Recipes in C++: The Art of Scientific Computing, 2nd Edition (printing corrected to software version 2.10).
Yann LeCun et al. Improving the convergence of back-propagation learning with second-order methods.
Andreas Ziehe et al. Adaptive On-line Learning in Changing Environments.
F. A. Seiler et al. Numerical Recipes in C: The Art of Scientific Computing.