Training neural nets with the reactive tabu search

In this paper the task of training subsymbolic systems is treated as a combinatorial optimization problem and solved with the heuristic scheme of the reactive tabu search (RTS). An iterative optimization process based on a "modified local search" component is complemented with a meta-strategy to realize a discrete dynamical system that discourages limit cycles and the confinement of the search trajectory to a limited portion of the search space. Cycles are discouraged by prohibiting (i.e., making tabu) the execution of moves that reverse the ones applied in the most recent part of the search, and the prohibition period is adapted automatically. Confinement is avoided, and a proper exploration is obtained, by activating a diversification strategy when too many configurations are repeated too often. The RTS method is applicable to nondifferentiable functions, is robust with respect to the random initialization, and is effective at continuing the search after local minima are reached. Three tests of the technique on feedforward and feedback systems are presented.
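The mechanism described above can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes a plain binary-string encoding, single-bit flips as moves, and a toy objective (number of mismatches against a fixed target string) standing in for a network's training error. The function name `reactive_tabu_search` and the parameters `increase`, `decrease`, `max_repeats`, and `max_chaotic`, together with their default values, are hypothetical choices made for illustration.

```python
import random


def reactive_tabu_search(f, n_bits, n_iters=500, seed=0,
                         increase=1.1, decrease=0.9,
                         max_repeats=3, max_chaotic=3):
    """Minimize f over bit lists of length n_bits (illustrative sketch only)."""
    rng = random.Random(seed)
    x = [rng.randint(0, 1) for _ in range(n_bits)]
    best_x, best_f = x[:], f(x)
    max_tenure = max(1.0, n_bits - 2.0)
    tenure = 1.0                               # prohibition period
    last_flip = [-n_iters - n_bits] * n_bits   # iteration of each bit's last flip
    visits = {}                                # configuration -> (last visit, repeats)
    chaotic = 0                                # count of over-repeated configurations
    last_change = 0                            # iteration of last tenure change
    avg_cycle = 1.0                            # moving average of cycle lengths

    def value_after_flip(i):
        # Evaluate f with bit i flipped, then restore x.
        x[i] ^= 1
        v = f(x)
        x[i] ^= 1
        return v

    for t in range(n_iters):
        key = tuple(x)
        if key in visits:
            # Repetition detected: react by lengthening the prohibition period.
            last_t, count = visits[key]
            visits[key] = (t, count + 1)
            if count + 1 > max_repeats:
                chaotic += 1
            avg_cycle = 0.9 * avg_cycle + 0.1 * (t - last_t)
            tenure = min(max(tenure * increase, tenure + 1.0), max_tenure)
            last_change = t
        else:
            visits[key] = (t, 0)
            # No repetition for a while: slowly shorten the prohibition period.
            if t - last_change > avg_cycle:
                tenure = max(tenure * decrease, 1.0)
                last_change = t

        if chaotic > max_chaotic:
            # Diversification: a burst of random flips to leave the region.
            for i in rng.sample(range(n_bits), max(1, n_bits // 4)):
                x[i] ^= 1
                last_flip[i] = t
            chaotic = 0
        else:
            # Modified local search: best non-tabu flip, even if it worsens f.
            allowed = [i for i in range(n_bits) if t - last_flip[i] > tenure]
            if not allowed:
                allowed = [min(range(n_bits), key=lambda i: last_flip[i])]
            move = min(allowed, key=value_after_flip)
            x[move] ^= 1
            last_flip[move] = t

        fx = f(x)
        if fx < best_f:
            best_x, best_f = x[:], fx
    return best_x, best_f


# Toy objective (an assumption): mismatches against a fixed target string.
target = [1, 0, 1, 1, 0, 0, 1, 0, 1, 1, 0, 1]


def mismatches(x):
    return sum(a != b for a, b in zip(x, target))


best_x, best_f = reactive_tabu_search(mismatches, len(target))
```

Only function evaluations are used, so the same loop applies unchanged to a nondifferentiable objective. In a training setting, `f` would presumably decode the bit string into discretized weights and return the error on the training set; that decoding is left out here because the encoding is not specified in this text.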
