Acceleration Techniques for the Backpropagation Algorithm

Like other gradient descent techniques, backpropagation converges slowly, even for medium-sized network problems. This is a consequence of the usually large dimension of the weight space and of the shape of the error surface around each iteration point. Oscillation between the walls of deep, narrow valleys, for example, is a well-known case in which gradient descent achieves poor convergence rates.
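The valley oscillation can be reproduced with a minimal sketch (an assumed illustration, not taken from the text): plain gradient descent on an ill-conditioned quadratic error surface, whose level curves form a deep, narrow valley, compared against the same iteration with a momentum term, one common acceleration technique. The learning rate, momentum coefficient, and starting point below are arbitrary choices for illustration.

```python
import numpy as np

def grad(w):
    # Gradient of the quadratic error E(w) = 0.5 * (w[0]**2 + 100 * w[1]**2),
    # whose narrow valley runs along the shallow w[0] direction.
    return np.array([w[0], 100.0 * w[1]])

def descend(w0, lr, steps, momentum=0.0):
    # Gradient descent with an optional momentum term; returns all iterates.
    w = np.array(w0, dtype=float)
    v = np.zeros_like(w)
    path = [w.copy()]
    for _ in range(steps):
        v = momentum * v - lr * grad(w)
        w = w + v
        path.append(w.copy())
    return np.array(path)

plain = descend([10.0, 1.0], lr=0.019, steps=100)
accel = descend([10.0, 1.0], lr=0.019, steps=100, momentum=0.5)

# Plain descent flips the sign of w[1] on every step (oscillation across the
# steep valley walls) while advancing slowly along the valley floor; momentum
# damps the oscillation and reaches the minimum at the origin faster.
print("first w[1] iterates (plain):", np.round(plain[:5, 1], 3))
print("distance to minimum, plain:   ", np.linalg.norm(plain[-1]))
print("distance to minimum, momentum:", np.linalg.norm(accel[-1]))
```

With these settings the plain iterates bounce between the two valley walls along the steep direction, while the momentum variant ends an order of magnitude closer to the minimum after the same number of steps.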