Evolutionary Algorithms for Reinforcement Learning

There are two distinct approaches to solving reinforcement learning problems, namely, searching in value function space and searching in policy space. Temporal difference methods and evolutionary algorithms are well-known examples of these approaches. Kaelbling, Littman and Moore recently provided an informative survey of temporal difference methods. This article focuses on the application of evolutionary algorithms to the reinforcement learning problem, emphasizing alternative policy representations, credit assignment methods, and problem-specific genetic operators. Strengths and weaknesses of the evolutionary approach to reinforcement learning are presented, along with a survey of representative applications.

[1]  Lawrence J. Fogel,et al.  Artificial Intelligence through Simulated Evolution , 1966 .

[2]  K. Dejong,et al.  An analysis of the behavior of a class of genetic adaptive systems , 1975 .

[3]  John Holland,et al.  Adaptation in Natural and Artificial Systems: An Introductory Analysis with Applications to Biology , 1975 .

[4]  John H. Holland,et al.  Cognitive systems based on adaptive algorithms , 1977, SGAR.

[5]  John H. Holland,et al.  Cognitive Systems Based on Adaptive Algorithms , 1978 .

[6]  Stephen F. Smith,et al.  Flexible Learning of Problem Solving Heuristics Through Adaptive Search , 1983, IJCAI.

[7]  Leslie G. Valiant,et al.  A theory of the learnable , 1984, STOC '84.

[8]  J. David Schaffer,et al.  Multi-Objective Learning via Genetic Algorithms , 1985, IJCAI.

[9]  John H. Holland,et al.  Escaping brittleness: the possibilities of general-purpose learning algorithms applied to parallel rule-based systems , 1995 .

[10]  John J. Grefenstette,et al.  Optimization of Control Parameters for Genetic Algorithms , 1986, IEEE Transactions on Systems, Man, and Cybernetics.

[11]  David E. Goldberg,et al.  Genetic Algorithms with Sharing for Multimodal Function Optimization , 1987, ICGA.

[12]  J. Holland Genetic Algorithms and Classifier Systems: Foundations and Future Directions , 1987, ICGA.

[13]  David E. Goldberg,et al.  Genetic Algorithms in Search, Optimization and Machine Learning , 1988 .

[14]  Darrell Whitley,et al.  Genitor: a different genetic algorithm , 1988 .

[15]  Rajarshi Das,et al.  A Study of Control Parameters Affecting Online Performance of Genetic Algorithms for Function Optimization , 1989, ICGA.

[16]  L. Darrell Whitley,et al.  The GENITOR Algorithm and Selection Pressure: Why Rank-Based Allocation of Reproductive Trials is Best , 1989, ICGA.

[17]  C.W. Anderson,et al.  Learning to control an inverted pendulum using neural networks , 1989, IEEE Control Systems Magazine.

[18]  A. Barto,et al.  Learning and Sequential Decision Making , 1989 .

[19]  Kai-Fu Lee,et al.  The Development of a World Class Othello Program , 1990, Artif. Intell..

[20]  Richard K. Belew,et al.  Evolving networks: using the genetic algorithm with connectionist learning , 1990 .

[21]  Richard S. Sutton,et al.  Integrated Architectures for Learning, Planning, and Reacting Based on Approximating Dynamic Programming , 1990, ML.

[22]  David R. Jefferson,et al.  Selection in Massively Parallel Genetic Algorithms , 1991, ICGA.

[23]  John J. Grefenstette,et al.  Using a Genetic Algorithm to Learn Behaviors for Autonomous Vehicles , 1992 .

[24]  John J. Grefenstette,et al.  An Approach to Anytime Learning , 1992, ML.

[25]  Lonnie Chrisman,et al.  Reinforcement Learning with Perceptual Aliasing: The Perceptual Distinctions Approach , 1992, AAAI.

[26]  John J. Grefenstette,et al.  Genetic Algorithms for Changing Environments , 1992, PPSN.

[27]  Long-Ji Lin,et al.  Memory Approaches to Reinforcement Learning in Non-Markovian Domains , 1992 .

[28]  John H. Holland,et al.  Adaptation in Natural and Artificial Systems: An Introductory Analysis with Applications to Biology, Control, and Artificial Intelligence , 1992 .

[29]  Inman Harvey,et al.  Explorations in Evolutionary Robotics , 1993, Adapt. Behav..

[30]  John R. Koza,et al.  Genetic programming - on the programming of computers by means of natural selection , 1993, Complex adaptive systems.

[31]  John J. Grefenstette,et al.  Genetic Algorithms for Tracking Changing Environments , 1993, ICGA.

[32]  Risto Miikkulainen,et al.  Evolving Neural Networks to Focus Minimax Search , 1994, AAAI.

[33]  Randall D. Beer,et al.  Sequential Behavior and Learning in Evolved Dynamical Neural Networks , 1994, Adapt. Behav..

[34]  Alan C. Schultz,et al.  Learning Robot Behaviors Using Genetic Algorithms , 1994 .

[35]  C. Fiechter Efficient Reinforcement Learning , 1994 .

[36]  Alden H. Wright,et al.  Simple Genetic Algorithms with Linear Fitness , 1994, Evolutionary Computation.

[37]  Stewart W. Wilson ZCS: A Zeroth Level Classifier System , 1994, Evolutionary Computation.

[38]  Mark B. Ring Continual learning in reinforcement environments , 1995, GMD-Bericht.

[39]  John J. Grefenstette,et al.  A Coevolutionary Approach to Learning Sequential Decision Rules , 1995, ICGA.

[40]  John J. Grefenstette,et al.  Robot Learning with Parallel Genetic Algorithms on Networked Computers , 1995 .

[41]  Ben J. A. Kröse,et al.  Learning from delayed rewards , 1995, Robotics Auton. Syst..

[42]  Inman Harvey,et al.  Circle in the round: State space attractors for evolved sighted robots , 1995, Robotics Auton. Syst..

[43]  David E. Moriarty Learning Sequential Decision Tasks , 1995 .

[44]  Andrew W. Moore,et al.  Reinforcement Learning: A Survey , 1996, J. Artif. Intell. Res..

[45]  Risto Miikkulainen,et al.  Evolving Obstacle Avoidance Behavior in a Robot Arm , 1996 .

[46]  Andrew McCallum,et al.  Reinforcement learning with selective perception and hidden state , 1996 .

[47]  Thomas G. Dietterich What is machine learning? , 2020, Archives of Disease in Childhood.

[48]  Risto Miikkulainen,et al.  Efficient Reinforcement Learning through Symbiotic Evolution , 1996, Machine Learning.

[49]  John J. Grefenstette,et al.  Rank-based selection , 2018, Evolutionary Computation 1.

[50]  Marco Colombetti,et al.  Robot Shaping: An Experiment in Behavior Engineering , 1997 .

[51]  Risto Miikkulainen,et al.  Forming Neural Networks Through Efficient and Adaptive Coevolution , 1997, Evolutionary Computation.

[52]  Mitchell A. Potter,et al.  The design and analysis of a computational model of cooperative coevolution , 1997 .

[53]  John J. Grefenstette,et al.  Proportional selection and sampling algorithms , 1997 .

[54]  Johannes P. Ros,et al.  Probably approximately correct (PAC) learning analysis , 1997 .

[55]  Richard S. Sutton,et al.  Introduction to Reinforcement Learning , 1998 .