Autonomous learning of abstractions using Curiosity-Driven Modular Incremental Slow Feature Analysis

To autonomously learn behaviors in complex environments, vision-based agents must develop useful sensory abstractions from high-dimensional video. We propose a modular, curiosity-driven learning system that autonomously acquires multiple abstract representations. A reinforcement-learned policy governs how the library of abstractions is built, while the abstractions themselves are learned by incremental slow feature analysis (IncSFA), which extracts features directly from raw visual input based on how that input changes over time. A gating system induces modularity and prevents duplicate abstractions. Learning is driven by a curiosity signal based on how learnable the current inputs are for the adaptive module. Once learning completes, the result is a set of slow-feature modules that serve as distinct, behavior-specific abstractions. Experiments with a simulated iCub humanoid robot show that the method effectively learns a set of abstractions from raw, unpreprocessed video; to our knowledge, this is the first curiosity-driven learning agent demonstrated to do so.
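To illustrate the core idea behind the abstractions the abstract describes, here is a minimal *batch* slow feature analysis sketch (the paper itself uses the incremental variant, IncSFA, which replaces the eigendecompositions below with online updates such as CCIPCA and minor component analysis). Given a multivariate time series, SFA whitens the data and then finds the linear direction whose output varies most slowly over time; all names here are illustrative, not the authors' code.

```python
import numpy as np

def slow_feature(X):
    """Extract the slowest linear feature from a signal matrix X (T x n).

    Minimal batch-SFA sketch: whiten the centered data, then pick the
    direction whose temporal derivative has the smallest variance.
    """
    X = X - X.mean(axis=0)
    # Whiten via eigendecomposition of the covariance matrix.
    cov = X.T @ X / len(X)
    d, E = np.linalg.eigh(cov)
    W = E / np.sqrt(d)              # whitening matrix (column-scaled)
    Z = X @ W                       # whitened signal: identity covariance
    # Slowness = variance of the temporal difference signal.
    dZ = np.diff(Z, axis=0)
    dcov = dZ.T @ dZ / len(dZ)
    _, dE = np.linalg.eigh(dcov)
    # Smallest-eigenvalue direction of the derivative covariance is slowest.
    w = W @ dE[:, 0]
    return X @ w

# Toy demo: a slow sine linearly mixed with a fast one;
# SFA recovers the slow source up to sign and scale.
t = np.linspace(0, 2 * np.pi, 1000)
slow, fast = np.sin(t), np.sin(29 * t)
rng = np.random.default_rng(0)
A = rng.normal(size=(2, 2))         # random mixing matrix
X = np.stack([slow, fast], axis=1) @ A.T
y = slow_feature(X)
```

In the full system, each gated module maintains such a slow-feature estimate online, and the curiosity signal rewards inputs whose slow features the current module can still improve on.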
