Paper information - AutoIncSFA and vision-based developmental learning for humanoid robots

Humanoids have to deal with novel, unsupervised, high-dimensional visual input streams. Our new method, AutoIncSFA, learns to compactly represent such complex sensory input sequences with very few meaningful features that correspond to high-level spatio-temporal abstractions, such as "a person is approaching me" or "an object was toppled". We explain the advantages of AutoIncSFA over previous related methods and show that the compact codes greatly facilitate the task of a reinforcement learner that drives the humanoid to actively explore its world like a playing baby, maximizing intrinsic curiosity reward signals for reaching states that correspond to previously unpredicted AutoIncSFA features.
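
To make the idea of "very few meaningful features" concrete, here is a minimal sketch of the slowness principle behind Slow Feature Analysis. It is not AutoIncSFA: it is plain batch linear SFA on a toy two-channel signal, with no incremental updates and no robot vision input. The function name `linear_sfa`, the toy sources, and the mixing matrix are all illustrative assumptions, not taken from the paper.

```python
import numpy as np


def linear_sfa(x, n_features):
    """Batch linear SFA sketch (assumed helper, not the paper's algorithm).

    x: array of shape (T, D), one row per time step.
    Returns (W, mean) so that (x - mean) @ W yields unit-variance outputs
    ordered from slowest to fastest.
    """
    mean = x.mean(axis=0)
    xc = x - mean

    # Step 1: whiten the input so every direction has unit variance.
    cov = xc.T @ xc / (len(xc) - 1)
    eigval, eigvec = np.linalg.eigh(cov)
    keep = eigval > 1e-10 * eigval.max()  # drop near-degenerate directions
    whiten = eigvec[:, keep] / np.sqrt(eigval[keep])
    z = xc @ whiten

    # Step 2: in whitened space, find directions whose temporal
    # differences have the smallest variance (the slowest features).
    dz = np.diff(z, axis=0)
    dcov = dz.T @ dz / (len(dz) - 1)
    _, dvec = np.linalg.eigh(dcov)  # eigenvalues in ascending order

    W = whiten @ dvec[:, :n_features]
    return W, mean


if __name__ == "__main__":
    # Toy data: a slow and a fast sine wave, linearly mixed into two channels.
    t = np.linspace(0, 2 * np.pi, 2000)
    slow, fast = np.sin(t), np.sin(37 * t)
    mixing = np.array([[1.0, 0.5], [0.7, -1.2]])  # arbitrary illustrative mixing
    x = np.stack([slow, fast], axis=1) @ mixing

    W, mean = linear_sfa(x, n_features=1)
    y = (x - mean) @ W
    print("abs. correlation with slow source:",
          abs(np.corrcoef(y[:, 0], slow)[0, 1]))
```

In this toy setting the single extracted feature should track the slow source up to sign, which is the sense in which a slowly varying code can summarize a faster-changing input stream.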
