Global random optimization by simultaneous perturbation stochastic approximation

A desire with iterative optimization techniques is that the algorithm reach the global optimum rather than become trapped at a local optimum. In this paper, we examine the theoretical and numerical global convergence properties of a certain gradient-free stochastic approximation algorithm, simultaneous perturbation stochastic approximation (SPSA), which has performed well in complex optimization problems. We establish two theorems on the global convergence of SPSA. The first provides conditions under which SPSA converges in probability to a global optimum using the well-known method of injected noise; the injected noise prevents the algorithm from converging prematurely to a local optimum. In the second theorem, we show that, under different conditions, "basic" SPSA without injected noise can achieve convergence in probability to a global optimum. This occurs because of the noise effectively (and automatically) introduced into the algorithm by the special form of the SPSA gradient approximation. Global convergence without injected noise can have important benefits in the setup (tuning) and performance (rate of convergence) of the algorithm. The discussion is supported by numerical studies showing favorable comparisons of SPSA to simulated annealing and genetic algorithms.
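To make the two variants discussed above concrete, the following is a minimal sketch of an SPSA iteration with an optional injected-noise term. It is not the authors' exact algorithm: the gain sequences, the decay rate of the injected noise, and the Styblinski-Tang test function used in the usage example are illustrative assumptions. The two-sided, simultaneously perturbed loss evaluations form the gradient approximation whose built-in randomness the second theorem exploits.

```python
import numpy as np

def spsa_minimize(loss, theta0, n_iter=1000,
                  a=0.1, c=0.1, A=100, alpha=0.602, gamma=0.101,
                  inject_noise=False, b=0.0, rng=None):
    """Illustrative SPSA minimizer; injected noise is optional (a sketch, not the paper's exact scheme)."""
    rng = np.random.default_rng() if rng is None else rng
    theta = np.asarray(theta0, dtype=float).copy()
    for k in range(n_iter):
        ak = a / (k + 1 + A) ** alpha              # step-size gain (assumed standard decay)
        ck = c / (k + 1) ** gamma                  # perturbation gain (assumed standard decay)
        delta = rng.choice([-1.0, 1.0], size=theta.shape)   # Bernoulli +/-1 perturbation vector
        # Simultaneous perturbation gradient approximation from two loss evaluations
        g_hat = (loss(theta + ck * delta) - loss(theta - ck * delta)) / (2.0 * ck * delta)
        theta = theta - ak * g_hat
        if inject_noise:
            # Injected Monte Carlo noise with decaying magnitude, to help escape local minima
            bk = b / (k + 1) ** 0.5                # decay rate is an illustrative assumption
            theta = theta + bk * rng.standard_normal(theta.shape)
    return theta

if __name__ == "__main__":
    # Usage example on a simple multimodal test function (Styblinski-Tang)
    f = lambda x: 0.5 * np.sum(x**4 - 16.0 * x**2 + 5.0 * x)
    print(spsa_minimize(f, theta0=np.zeros(2), n_iter=5000, a=0.02, c=0.2))
```

In this sketch, setting `inject_noise=True` with `b > 0` corresponds to the first theorem's setting, while the default corresponds to "basic" SPSA; note that even without injected noise each iterate is randomized through the perturbation vector `delta`.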
