Paper Information - An Integrated Neuroevolutionary Approach to Reactive Control and High-Level Strategy

An Integrated Neuroevolutionary Approach to Reactive Control and High-Level Strategy

One promising approach to general-purpose artificial intelligence is neuroevolution, which has worked well on a number of problems from resource optimization to robot control. However, state-of-the-art neuroevolution algorithms like neuroevolution of augmenting topologies (NEAT) have surprising difficulty on problems that are fractured, i.e., where the desired actions change abruptly and frequently. Previous work demonstrated that bias and constraint (e.g., RBF-NEAT and Cascade-NEAT algorithms) can improve learning significantly on such problems. However, experiments in this paper show that relatively unrestricted algorithms (e.g., NEAT) still yield the best performance on problems requiring reactive control. Ideally, a single algorithm would be able to perform well on both fractured and unfractured problems. This paper introduces such an algorithm called SNAP-NEAT that uses adaptive operator selection to integrate strengths of NEAT, RBF-NEAT, and Cascade-NEAT. SNAP-NEAT is evaluated empirically on a set of problems ranging from reactive control to high-level strategy. The results show that SNAP-NEAT can adapt intelligently to the type of problem that it faces, thus laying the groundwork for learning algorithms that can be applied to a wide variety of problems.
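The abstract does not say how SNAP-NEAT performs adaptive operator selection, so the following is only a minimal sketch of one generic scheme from the operator-selection literature, adaptive pursuit, and should not be read as the authors' method. The class name, the parameter values (`p_min`, `alpha`, `beta`), and the operator labels are all assumptions made for illustration. The idea is that each structural mutation operator keeps a running estimate of the reward it earns, and selection probability is gradually pushed toward whichever operator currently looks best, while every operator keeps a minimum probability so it can still be tried.

```python
import random


class AdaptivePursuit:
    """Generic adaptive-pursuit operator selection (illustrative sketch only)."""

    def __init__(self, operators, p_min=0.05, alpha=0.8, beta=0.8, seed=None):
        k = len(operators)
        if k < 2 or not (0.0 < p_min < 1.0 / k):
            raise ValueError("need at least 2 operators and 0 < p_min < 1/k")
        self.operators = list(operators)
        self.p_min = p_min                     # floor on every operator's probability
        self.p_max = 1.0 - (k - 1) * p_min     # ceiling for the current best operator
        self.alpha = alpha                     # learning rate for reward estimates
        self.beta = beta                       # learning rate for probabilities
        self.quality = [0.0] * k               # running reward estimate per operator
        self.prob = [1.0 / k] * k              # start from a uniform distribution
        self.rng = random.Random(seed)

    def select(self):
        """Sample an operator index according to the current probabilities."""
        return self.rng.choices(range(len(self.operators)), weights=self.prob)[0]

    def update(self, index, reward):
        """Record the reward earned by operator `index` and shift probabilities."""
        self.quality[index] += self.alpha * (reward - self.quality[index])
        best = max(range(len(self.quality)), key=self.quality.__getitem__)
        for i in range(len(self.prob)):
            # Pursue p_max for the best operator and p_min for all others.
            target = self.p_max if i == best else self.p_min
            self.prob[i] += self.beta * (target - self.prob[i])


if __name__ == "__main__":
    # Hypothetical labels standing in for NEAT-, RBF-, and cascade-style mutations.
    selector = AdaptivePursuit(["neat", "rbf", "cascade"], seed=0)
    for _ in range(100):
        i = selector.select()
        # Toy reward: pretend the "rbf" operator is the only one that helps.
        selector.update(i, 1.0 if selector.operators[i] == "rbf" else 0.0)
    print(dict(zip(selector.operators, selector.prob)))
```

In an evolutionary loop, the reward passed to `update` would typically be some measure of fitness improvement produced by the offspring that the chosen operator created; how that reward is defined is itself a design choice that this sketch leaves open.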

Risto Miikkulainen | Nate Kohl
