Intrinsically Motivated Learning of Hierarchical Collections of Skills

Humans and other animals often engage in activities for their own sakes rather than as steps toward solving practical problems. Psychologists call these intrinsically motivated behaviors. What we learn during intrinsically motivated behavior is essential for our development as competent autonomous entities able to efficiently solve a wide range of practical problems as they arise. In this paper we present initial results from a computational study of intrinsically motivated learning aimed at allowing artificial agents to construct and extend hierarchies of reusable skills that are needed for competent autonomy. At the core of the model are recent theoretical and algorithmic advances in computational reinforcement learning, specifically, new concepts related to skills and new learning algorithms for learning with skill hierarchies.

[1]  K. Groos The Play of Man , 1901, Nature.

[2]  R. W. White Motivation reconsidered: the concept of competence. , 1959, Psychological review.

[3]  P. L. Adams THE ORIGINS OF INTELLIGENCE IN CHILDREN , 1976 .

[4]  Edward L. Deci,et al.  Intrinsic Motivation and Self-Determination in Human Behavior , 1975, Perspectives in Social Psychology.

[5]  K. Miller,et al.  Intrinsic Motivation and Self-Determination in Human Behavior , 1975, Perspectives in Social Psychology.

[6]  Richard S. Sutton,et al.  Integrated Modeling and Control Based on Reinforcement Learning and Dynamic Programming , 1990, NIPS 1990.

[7]  Jürgen Schmidhuber,et al.  A possibility for implementing curiosity and boredom in model-building neural controllers , 1991 .

[8]  Joel L. Davis,et al.  A Model of How the Basal Ganglia Generate and Use Neural Signals That Predict Reinforcement , 1994 .

[9]  Karl J. Friston,et al.  Value-dependent selection in the brain: Simulation in a synthetic neural model , 1994, Neuroscience.

[10]  S. Hochreiter,et al.  REINFORCEMENT DRIVEN INFORMATION ACQUISITION IN NONDETERMINISTIC ENVIRONMENTS , 1995 .

[11]  P. Dayan,et al.  A framework for mesencephalic dopamine systems based on predictive Hebbian learning , 1996, The Journal of neuroscience : the official journal of the Society for Neuroscience.

[12]  T. Nokes,et al.  Intrinsic reinforcing properties of putatively neutral stimuli in an instrumental two-lever discrimination task , 1996 .

[13]  J. Horvitz,et al.  Burst activity of ventral tegmental dopamine neurons is elicited by sensory stimuli in the awake cat , 1997, Brain Research.

[14]  Peter Dayan,et al.  A Neural Substrate of Prediction and Reward , 1997, Science.

[15]  Jean-Arcady Meyer,et al.  Learning Hierarchical Control Structures for Multiple Tasks and Changing Environments , 1998 .

[16]  G. Di Chiara Drug addiction as dopamine-dependent associative learning disorder. , 1999, European journal of pharmacology.

[17]  Doina Precup,et al.  Between MDPs and Semi-MDPs: A Framework for Temporal Abstraction in Reinforcement Learning , 1999, Artif. Intell..

[18]  Andrew Y. Ng,et al.  Policy Invariance Under Reward Transformations: Theory and Application to Reward Shaping , 1999, ICML.

[19]  Peter Dayan,et al.  Dopamine Bonuses , 2000, NIPS.

[20]  Rob Saunders,et al.  Curious Design Agents and Artificial Creativity - A Synthetic Approach to the Study of Creative Behaviour , 2001 .

[21]  James L. McClelland,et al.  Autonomous Mental Development by Robots and Animals , 2001, Science.

[22]  Andrew G. Barto,et al.  Autonomous discovery of temporal abstractions from interaction with an environment , 2002 .

[23]  Bernhard Hengst,et al.  Discovering Hierarchy in Reinforcement Learning with HEXQ , 2002, ICML.

[24]  Paul E. Utgoff,et al.  Many-Layered Learning , 2002, Neural Computation.

[25]  Xiao Huang,et al.  Novelty and Reinforcement Learning in the Value System of Developmental Robots , 2002 .

[26]  P. Dayan,et al.  Reward, Motivation, and Reinforcement Learning , 2002, Neuron.

[27]  Peter Dayan,et al.  Dopamine: generalization and bonuses , 2002, Neural Networks.

[28]  Pierre-Yves Oudeyer,et al.  Motivational principles for visual know-how development , 2003 .

[29]  Sridhar Mahadevan,et al.  Recent Advances in Hierarchical Reinforcement Learning , 2003, Discret. Event Dyn. Syst..

[30]  Sridhar Mahadevan,et al.  Recent Advances in Hierarchical Reinforcement Learning , 2003, Discret. Event Dyn. Syst..

[31]  Nuttapong Chentanez,et al.  Intrinsically Motivated Reinforcement Learning , 2004, NIPS.

[32]  Terrence J. Sejnowski,et al.  Exploration Bonuses and Dual Control , 1996, Machine Learning.

[33]  Richard S. Sutton,et al.  Reinforcement Learning: An Introduction , 1998, IEEE Trans. Neural Networks.