Genetic Reinforcement Learning for Neurocontrol Problems

Empirical tests indicate that at least one class of genetic algorithms performs well at neural network weight optimization, in terms of both learning rates and scalability. The successful application of these genetic algorithms to supervised learning problems sets the stage for their use in reinforcement learning problems. On a simulated inverted-pendulum control problem, “genetic reinforcement learning” produces results competitive with those of the adaptive heuristic critic (AHC), another well-known reinforcement learning paradigm for neural networks, which employs the temporal difference method. The two algorithms are compared in terms of learning rates, performance-based generalization, and control behavior over time.
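To make the idea concrete, the following is a minimal sketch of how a genetic search might optimize the weights of a small network on a simulated inverted pendulum, using only the failure signal (the number of steps balanced) as fitness. It is not the procedure reported in the paper: the cart-pole constants follow the commonly used benchmark formulation, while the network size, the rank-biased parent choice, the Gaussian mutation, the replace-the-worst rule, and the names `balance_steps` and `evolve` are all illustrative assumptions.

```python
import math
import random


def balance_steps(weights, max_steps=500, n_hidden=5):
    """Run one cart-pole trial; return the number of steps before failure.

    Assumed layout of `weights` (31 values for n_hidden=5):
    n_hidden rows of 5 input weights (4 state inputs + bias),
    then n_hidden hidden-to-output weights, then one output bias.
    """
    x, x_dot, theta, theta_dot = 0.0, 0.0, 0.05, 0.0
    for step in range(max_steps):
        # Roughly scaled state plus a constant bias input.
        inputs = (x / 2.4, x_dot, theta / 0.21, theta_dot, 1.0)
        out = weights[-1]
        for h in range(n_hidden):
            row = weights[h * 5:(h + 1) * 5]
            act = math.tanh(sum(w * v for w, v in zip(row, inputs)))
            out += weights[n_hidden * 5 + h] * act
        # Bang-bang control: push right or left with a fixed force.
        force = 10.0 if out > 0 else -10.0

        # Standard cart-pole dynamics, integrated with Euler steps of 0.02 s.
        sin_t, cos_t = math.sin(theta), math.cos(theta)
        temp = (force + 0.05 * theta_dot ** 2 * sin_t) / 1.1
        theta_acc = (9.8 * sin_t - cos_t * temp) / (
            0.5 * (4.0 / 3.0 - 0.1 * cos_t ** 2 / 1.1)
        )
        x_acc = temp - 0.05 * theta_acc * cos_t / 1.1
        x += 0.02 * x_dot
        x_dot += 0.02 * x_acc
        theta += 0.02 * theta_dot
        theta_dot += 0.02 * theta_acc

        # Failure: cart leaves the track or pole falls past about 12 degrees.
        if abs(x) > 2.4 or abs(theta) > 0.2095:
            return step
    return max_steps


def evolve(pop_size=30, n_offspring=300, sigma=0.5, seed=0):
    """Steady-state genetic search over weight vectors (illustrative only)."""
    rng = random.Random(seed)
    n_weights = 31
    pop = []
    for _ in range(pop_size):
        w = [rng.uniform(-1.0, 1.0) for _ in range(n_weights)]
        pop.append((balance_steps(w), w))
    pop.sort(key=lambda p: p[0], reverse=True)  # best individual first
    initial_best = pop[0][0]

    for _ in range(n_offspring):
        # Rank-biased choice: the smaller of two random ranks favors fitter parents.
        idx = min(rng.randrange(pop_size), rng.randrange(pop_size))
        child = [w + rng.gauss(0.0, sigma) for w in pop[idx][1]]
        fit = balance_steps(child)
        # One offspring at a time replaces the worst member if it is no worse.
        if fit >= pop[-1][0]:
            pop[-1] = (fit, child)
            pop.sort(key=lambda p: p[0], reverse=True)
    return initial_best, pop[0][0], pop[0][1]
```

Calling `evolve()` returns the best fitness in the initial random population, the best fitness after the search, and the corresponding weight vector. Because only the worst member is ever replaced, the best fitness in this sketch cannot decrease during the run; how quickly it improves depends on the assumed settings.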
