A Neuroevolution Approach to General Atari Game Playing

This paper addresses the challenge of learning to play many different video games with little domain-specific knowledge. Specifically, it introduces a neuroevolution approach to general Atari 2600 game playing. Four neuroevolution algorithms were paired with three different state representations and evaluated on a set of 61 Atari games. The neuroevolution agents represent different points along the spectrum of algorithmic sophistication—including weight evolution on topologically fixed neural networks (conventional neuroevolution), covariance matrix adaptation evolution strategy (CMA-ES), neuroevolution of augmenting topologies (NEAT), and indirect network encoding (HyperNEAT). State representations include an object representation of the game screen, the raw pixels of the game screen, and seeded noise (a comparative baseline). Results indicate that direct-encoding methods work best on compact state representations while indirect-encoding methods (i.e., HyperNEAT) allow scaling to higher dimensional representations (i.e., the raw game screen). Previous approaches based on temporal-difference (TD) learning had trouble dealing with the large state spaces and sparse reward gradients often found in Atari games. Neuroevolution ameliorates these problems and evolved policies achieve state-of-the-art results, even surpassing human high scores on three games. These results suggest that neuroevolution is a promising approach to general video game playing (GVGP).
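To make the simplest point on the spectrum concrete, the sketch below illustrates conventional neuroevolution: the topology of a small policy network is fixed, and only its weight vector is evolved by truncation selection and Gaussian mutation. This is a minimal toy illustration, not the paper's actual setup; the network sizes, the mutation scheme, and the `fitness` function (a stand-in for episode reward on a fixed observation) are all assumptions made for brevity.

```python
import numpy as np

rng = np.random.default_rng(0)
N_IN, N_HIDDEN, N_OUT = 4, 8, 2  # fixed topology (toy sizes)
N_PARAMS = N_IN * N_HIDDEN + N_HIDDEN * N_OUT

def policy(weights, obs):
    """Fixed-topology MLP: map a flat weight vector to action scores."""
    w1 = weights[: N_IN * N_HIDDEN].reshape(N_IN, N_HIDDEN)
    w2 = weights[N_IN * N_HIDDEN:].reshape(N_HIDDEN, N_OUT)
    return np.tanh(obs @ w1) @ w2

def fitness(weights):
    """Toy stand-in for episode reward: prefer action 0 on a fixed input.
    In the paper this would be the game score from an emulator rollout."""
    scores = policy(weights, np.ones(N_IN))
    return scores[0] - scores[1]

pop_size, n_parents, sigma = 32, 8, 0.1
population = rng.normal(0.0, 1.0, size=(pop_size, N_PARAMS))

for gen in range(50):
    scores = np.array([fitness(ind) for ind in population])
    # Truncation selection: keep the top n_parents weight vectors.
    parents = population[np.argsort(scores)[-n_parents:]]
    # Refill the population with mutated copies of random parents.
    children = parents[rng.integers(n_parents, size=pop_size)]
    population = children + sigma * rng.normal(size=children.shape)

best = population[np.argmax([fitness(ind) for ind in population])]
```

NEAT extends this scheme by also mutating the topology (adding nodes and connections), and HyperNEAT replaces the directly evolved weight vector with an indirect encoding that generates the weights, which is what allows it to scale to raw-pixel inputs.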
