Parameter adaptation in stochastic optimization

Abstract

Optimization is an important operation in many domains of science and technology. Local optimization techniques typically employ some form of iterative procedure, based on derivatives of the function to be optimized (the objective function). These techniques normally involve parameters that must be set by the user, often by trial and error, and those parameters can have a strong influence on the convergence speed of the optimization. In several cases, a significant speed advantage could be gained by varying these parameters during the optimization, to reflect the local characteristics of the function being optimized. Some parameter adaptation methods have been proposed for this purpose in deterministic optimization settings. For stochastic (also called on-line) optimization, however, there appears to be no simple and effective parameter adaptation method. This paper proposes a new method for parameter adaptation in stochastic optimization. The method is applicable to a wide range of objective functions, as well as to a large set of local optimization techniques. We present the derivation of the method, details of its application to gradient descent and to some of its variants, and examples of its use in the gradient optimization of several functions, as well as in the training of a multilayer perceptron by on-line backpropagation.

Introduction

Optimization is an operation that is frequently used in several different domains of science and technology. It normally consists of maximizing or minimizing a given function (called the objective function), chosen to represent the quality of a given system. The system may be physical (mechanical, chemical, etc.), a mathematical model, a computer program, etc., or even a mixture of several of these.
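
The influence of such user-set parameters can be illustrated with plain gradient descent, which is one of the local optimization techniques the paper later considers. The sketch below is not the proposed adaptation method; the quadratic objective, the step size eta, and the function names are illustrative assumptions. It minimizes a simple one-dimensional function with a fixed step size and shows how strongly the convergence depends on that single parameter.

def objective(x):
    """Quadratic objective f(x) = (x - 3)^2; minimum at x = 3."""
    return (x - 3.0) ** 2

def gradient(x):
    """Derivative of the objective, f'(x) = 2 (x - 3)."""
    return 2.0 * (x - 3.0)

def gradient_descent(x0, eta=0.1, n_steps=50):
    """Iterate x <- x - eta * f'(x) with a fixed, user-chosen step size eta."""
    x = x0
    for _ in range(n_steps):
        x -= eta * gradient(x)
    return x

if __name__ == "__main__":
    # The quality of the final point after a fixed number of steps depends
    # strongly on the step size, which is normally chosen by trial and error.
    for eta in (0.01, 0.1, 0.5):
        x_final = gradient_descent(x0=0.0, eta=eta)
        print(f"eta = {eta:4.2f}  ->  x = {x_final:.6f}, f(x) = {objective(x_final):.2e}")

Running the sketch shows that, for the same number of iterations, different fixed step sizes yield very different distances to the minimum; adapting the parameter during the optimization is what the method proposed in this paper addresses, in the harder stochastic setting.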