Simulated annealing of neural networks: The 'cooling' strategy reconsidered

The simulated annealing (SA) algorithm has been widely used to address intractable global optimization problems in many fields, including the training of artificial neural networks. Implementations of annealing universally use a monotone decreasing, or cooling, temperature schedule, a choice motivated by the algorithm's proof of optimality as well as by analogies with statistical thermodynamics. The authors challenge this motivation: the theoretical optimality of cooling schedules bears little relation to the algorithm's practical performance. This finding rests on a new best-so-far criterion for measuring the quality of annealing schedules. Motivated by studies of optimal schedules for small problems, highly nonstandard annealing schedules are examined for the training of feedforward perceptron networks on a real-world sensor classification benchmark. Clear evidence is found that optimal schedules need not decrease monotonically to zero.
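To make the best-so-far criterion and the notion of a nonmonotone schedule concrete, the following is a minimal Python sketch of a generic annealing loop whose run is judged by the best value seen rather than the final state, and which accepts an arbitrary temperature schedule. The toy objective, the Gaussian neighbor move, and the sawtooth "reheating" schedule are illustrative assumptions, not the authors' benchmark or their optimal schedules.

import math
import random

def simulated_anneal(objective, neighbor, x0, schedule, n_steps, seed=0):
    """Generic simulated annealing loop, judged by the best-so-far value.

    `schedule` maps the step index to a temperature and, per the
    abstract's point, is not required to be monotone decreasing.
    """
    rng = random.Random(seed)
    x = x0
    fx = objective(x)
    best_x, best_f = x, fx  # best-so-far tracking
    for t in range(n_steps):
        T = schedule(t)
        y = neighbor(x, rng)
        fy = objective(y)
        # Metropolis rule: always take improvements; take uphill moves
        # with probability exp(-(fy - fx) / T).
        if fy <= fx or (T > 0.0 and rng.random() < math.exp((fx - fy) / T)):
            x, fx = y, fy
            if fx < best_f:
                best_x, best_f = x, fx
    return best_x, best_f

def sawtooth_schedule(t, T0=1.0, alpha=0.995, period=1000):
    # A hypothetical nonmonotone schedule: cool geometrically within
    # each period, then reheat to T0. Purely illustrative; not one of
    # the schedules studied in the paper.
    return T0 * alpha ** (t % period)

# Toy usage on a 1-D multimodal objective (illustrative only):
f = lambda x: x * x + 10.0 * math.sin(3.0 * x)
move = lambda x, rng: x + rng.gauss(0.0, 0.5)
x_best, f_best = simulated_anneal(f, move, x0=5.0, schedule=sawtooth_schedule, n_steps=20000)

Because only the best-so-far pair is returned, a schedule that reheats and re-explores cannot hurt the reported solution quality, which is what makes nonmonotone schedules competitive under this criterion.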