A Neuroevolution Approach to General Atari Game Playing

This paper addresses the challenge of learning to play many different video games with little domain-specific knowledge. Specifically, it introduces a neuroevolution approach to general Atari 2600 game playing. Four neuroevolution algorithms were paired with three different state representations and evaluated on a set of 61 Atari games. The neuroevolution agents represent different points along the spectrum of algorithmic sophistication—including weight evolution on topologically fixed neural networks (conventional neuroevolution), covariance matrix adaptation evolution strategy (CMA-ES), neuroevolution of augmenting topologies (NEAT), and indirect network encoding (HyperNEAT). State representations include an object representation of the game screen, the raw pixels of the game screen, and seeded noise (a comparative baseline). Results indicate that direct-encoding methods work best on compact state representations while indirect-encoding methods (i.e., HyperNEAT) allow scaling to higher dimensional representations (i.e., the raw game screen). Previous approaches based on temporal-difference (TD) learning had trouble dealing with the large state spaces and sparse reward gradients often found in Atari games. Neuroevolution ameliorates these problems and evolved policies achieve state-of-the-art results, even surpassing human high scores on three games. These results suggest that neuroevolution is a promising approach to general video game playing (GVGP).
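To make the simplest point on the spectrum concrete, the sketch below illustrates conventional neuroevolution: the topology of a small policy network is fixed, and only its weight vector is evolved by truncation selection and Gaussian mutation. This is a minimal toy illustration, not the paper's actual setup; the network sizes, the mutation scheme, and the `fitness` function (a stand-in for episode reward on a fixed observation) are all assumptions made for brevity.

```python
import numpy as np

rng = np.random.default_rng(0)
N_IN, N_HIDDEN, N_OUT = 4, 8, 2  # fixed topology (toy sizes)
N_PARAMS = N_IN * N_HIDDEN + N_HIDDEN * N_OUT

def policy(weights, obs):
    """Fixed-topology MLP: map a flat weight vector to action scores."""
    w1 = weights[: N_IN * N_HIDDEN].reshape(N_IN, N_HIDDEN)
    w2 = weights[N_IN * N_HIDDEN:].reshape(N_HIDDEN, N_OUT)
    return np.tanh(obs @ w1) @ w2

def fitness(weights):
    """Toy stand-in for episode reward: prefer action 0 on a fixed input.
    In the paper this would be the game score from an emulator rollout."""
    scores = policy(weights, np.ones(N_IN))
    return scores[0] - scores[1]

pop_size, n_parents, sigma = 32, 8, 0.1
population = rng.normal(0.0, 1.0, size=(pop_size, N_PARAMS))

for gen in range(50):
    scores = np.array([fitness(ind) for ind in population])
    # Truncation selection: keep the top n_parents weight vectors.
    parents = population[np.argsort(scores)[-n_parents:]]
    # Refill the population with mutated copies of random parents.
    children = parents[rng.integers(n_parents, size=pop_size)]
    population = children + sigma * rng.normal(size=children.shape)

best = population[np.argmax([fitness(ind) for ind in population])]
```

NEAT extends this scheme by also mutating the topology (adding nodes and connections), and HyperNEAT replaces the directly evolved weight vector with an indirect encoding that generates the weights, which is what allows it to scale to raw-pixel inputs.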
