Abstract. Use of the logistic derivative in backward error propagation suggests one source of ill-conditioning: the multiplier applied in computing the elements of the gradient shrinks at each successive layer. A compensatory rescaling is suggested, based heuristically on the expected value of the multiplier. Experimental results demonstrate an order-of-magnitude improvement in convergence.
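A minimal sketch of the idea, under stated assumptions: the abstract does not give the exact rescaling rule, so this example assumes the gradient at each layer is multiplied by the inverse of the expected logistic-derivative multiplier. If the activations a are roughly uniform on (0, 1), then E[a(1 - a)] = 1/6, so the error signal shrinks by about a factor of 6 per layer and the compensation multiplies layer gradients by 6, 36, ... counting back from the output. The network, function names, and learning rate are illustrative, not taken from the paper.

```python
import numpy as np

def logistic(x):
    return 1.0 / (1.0 + np.exp(-x))

def backprop_with_rescaling(weights, x, target, lr=0.1):
    """One gradient step on a squared-error loss for a logistic MLP,
    rescaling each layer's gradient to offset the shrinking multiplier."""
    # Forward pass, keeping activations for the backward pass.
    activations = [x]
    for W in weights:
        activations.append(logistic(activations[-1] @ W))

    # Backward pass. delta carries the error signal; each application of
    # the logistic derivative a * (1 - a) shrinks it by ~1/6 on average.
    delta = activations[-1] - target        # dL/d(output) for squared error
    expected_multiplier = 1.0 / 6.0         # E[a(1-a)] for a ~ U(0, 1); an assumption
    scale = 1.0
    for l in reversed(range(len(weights))):
        a = activations[l + 1]
        delta = delta * a * (1.0 - a)       # logistic-derivative multiplier
        scale /= expected_multiplier        # compensate: 6, 36, ... per layer
        grad = activations[l].T @ delta
        delta = delta @ weights[l].T        # propagate before updating weights
        weights[l] -= lr * scale * grad     # rescaled gradient step
    return weights

# Toy usage: a three-layer network on random data.
rng = np.random.default_rng(0)
weights = [rng.normal(scale=0.5, size=(4, 8)),
           rng.normal(scale=0.5, size=(8, 8)),
           rng.normal(scale=0.5, size=(8, 2))]
x = rng.normal(size=(16, 4))
target = rng.uniform(size=(16, 2))
weights = backprop_with_rescaling(weights, x, target)
```

The per-layer scale plays the same role as a layer-dependent learning rate: deeper (earlier) layers, whose raw gradients have passed through more logistic-derivative multipliers, receive proportionally larger steps.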