Learning Algorithmic Solutions to Symbolic Planning Tasks with a Neural Computer

A key feature of intelligent behavior is the ability to learn abstract strategies that transfer to unfamiliar problems. We therefore present a novel architecture, based on memory-augmented networks and inspired by the von Neumann and Harvard architectures of modern computers, that enables the learning of abstract algorithmic solutions via Evolution Strategies in a reinforcement learning setting. Applied to Sokoban, sliding-block puzzles, and robotic manipulation tasks, the architecture learns algorithmic solutions with strong generalization and abstraction: they scale to arbitrary task configurations and complexities, and they are independent of both the data representation and the task domain.
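To make the training mechanism concrete, the following is a minimal sketch of an OpenAI-style Evolution Strategy, the class of black-box optimizer the abstract refers to. The fitness function and the names `fitness` and `es_step` are illustrative stand-ins, not from the paper; in the actual setting, `fitness` would be the episode return of the neural-computer policy on a planning task.

```python
import numpy as np

def fitness(theta):
    # Toy stand-in for an episode return: higher is better,
    # maximized at theta == target. Purely illustrative.
    target = np.array([1.0, -2.0, 0.5])
    return -np.sum((theta - target) ** 2)

def es_step(theta, rng, sigma=0.1, alpha=0.05, pop_size=50):
    # Sample a population of Gaussian perturbations of the parameters.
    eps = rng.standard_normal((pop_size, theta.size))
    returns = np.array([fitness(theta + sigma * e) for e in eps])
    # Rank-normalize returns for robustness (a common ES trick).
    ranks = returns.argsort().argsort()
    weights = ranks / (pop_size - 1) - 0.5
    # Stochastic estimate of the gradient of expected fitness,
    # computed from function evaluations only (no backpropagation).
    grad = (weights @ eps) / (pop_size * sigma)
    return theta + alpha * grad

rng = np.random.default_rng(42)
theta = np.zeros(3)
for _ in range(2000):
    theta = es_step(theta, rng)
```

Because the update uses only scalar returns, the same loop applies unchanged whether the parameters encode a linear map or the controller of a memory-augmented network.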
