### Accelerating the convergence of the back-propagation method

The utility of the back-propagation method in establishing suitable weights in a distributed adaptive network has been demonstrated repeatedly. Unfortunately, in many applications, the number of iterations required before convergence can be large. Modifications to the back-propagation algorithm described by Rumelhart et al. (1986) can greatly accelerate convergence. The modifications consist of three changes: 1) instead of updating the network weights after each pattern is presented to the network, the network is updated only after the entire repertoire of patterns to be learned has been presented to the network, at which time the algebraic sums of all the weight changes are applied; 2) instead of keeping η, the “learning rate” (i.e., the multiplier on the step size), constant, it is varied dynamically so that the algorithm utilizes a near-optimum η, as determined by the local optimization topography; and 3) the momentum factor α is set to zero when, as signified by a failure of a step to reduce the total error, the information inherent in prior steps is more likely to be misleading than beneficial. Only after the network takes a useful step, i.e., one that reduces the total error, does α again assume a non-zero value. Considering the selection of weights in neural nets as a problem in classical nonlinear optimization theory, the rationale for algorithms seeking only those weights that produce the globally minimum error is reviewed and rejected.
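
The three modifications can be illustrated with a minimal sketch. It is not the authors' implementation: a two-weight linear unit stands in for a full network so that the example stays self-contained. The function names, the growth and shrink factors applied to η (`grow`, `shrink`), and the choice to discard a step that fails to reduce the total error are all assumptions made for illustration, since the abstract does not specify them.

```python
def total_error_and_gradient(weights, patterns):
    """Sum the squared error and its gradient over ALL patterns (modification 1)."""
    w, b = weights
    error = 0.0
    grad = [0.0, 0.0]
    for x, target in patterns:
        diff = (w * x + b) - target
        error += 0.5 * diff * diff
        grad[0] += diff * x   # algebraic sum of the per-pattern contributions
        grad[1] += diff
    return error, grad


def train(patterns, weights, eta=0.1, alpha=0.9,
          grow=1.05, shrink=0.7, epochs=200):
    """Batch gradient descent with adaptive eta and momentum reset.

    grow, shrink and the rejection of failed steps are illustrative
    assumptions, not values or rules taken from the source text.
    """
    weights = list(weights)
    error, grad = total_error_and_gradient(weights, patterns)
    prev_step = [0.0] * len(weights)
    momentum = alpha
    history = [error]
    for _ in range(epochs):
        # One update per full pass over the pattern set.
        step = [-eta * g + momentum * p for g, p in zip(grad, prev_step)]
        trial = [w + s for w, s in zip(weights, step)]
        trial_error, trial_grad = total_error_and_gradient(trial, patterns)
        if trial_error < error:
            # Useful step: keep it, enlarge eta (modification 2),
            # and let alpha take its non-zero value again (modification 3).
            weights, error, grad = trial, trial_error, trial_grad
            prev_step = step
            eta *= grow
            momentum = alpha
        else:
            # Failed step: discard it (assumption), reduce eta,
            # and set the momentum factor to zero for the next attempt.
            eta *= shrink
            momentum = 0.0
        history.append(error)
    return weights, error, history


if __name__ == "__main__":
    # Patterns drawn from target = 2 * x + 1 (hypothetical toy data).
    data = [(0.0, 1.0), (1.0, 3.0), (2.0, 5.0), (3.0, 7.0)]
    final_weights, final_error, _ = train(data, [0.0, 0.0])
    print(final_weights, final_error)
```

Because a step is kept only when it lowers the total error, the recorded error never increases from one epoch to the next in this sketch; a variant that tolerates small increases would need a different acceptance test.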