Paper information - Reinforcement learning of hierarchical skills on the Sony AIBO robot

Reinforcement learning of hierarchical skills on the Sony AIBO robot

Humans frequently engage in activities for their own sake rather than as a step towards solving a specific task. During such behavior, which psychologists describe as intrinsically motivated, we often develop skills that allow us to exercise mastery over our environment. The authors of [7] recently proposed an algorithm for intrinsically motivated reinforcement learning (IMRL) aimed at constructing hierarchies of skills through an agent's self-motivated interaction with its environment. While they successfully demonstrated the utility of IMRL in simulation, we present the first realization of this approach on a real robot. To this end, we implemented a control architecture for the Sony AIBO robot that extends the IMRL algorithm to this platform. Through experiments, we examine whether the AIBO is indeed able to learn useful skill hierarchies.

Vishal Soni | Satinder Singh

[1] Peter Dayan, et al. Dopamine: generalization and bonuses, 2002, Neural Networks.

[2] Pierre-Yves Oudeyer, et al. Maximizing Learning Progress: An Internal Reward System for Development, 2003, Embodied Artificial Intelligence.

[3] Lisa Meeden, et al. Self-Motivated, Task-Independent Reinforcement Learning for Robots, 2004.

[4] Amy McGovern. Autonomous Discovery of Abstractions through Interaction with an Environment, 2002, SARA.

[5] Nuttapong Chentanez, et al. Intrinsically Motivated Learning of Hierarchical Collections of Skills, 2004.

[6] Richard S. Sutton, et al. Reinforcement Learning: An Introduction, 1998, IEEE Trans. Neural Networks.

[7] Doina Precup, et al. Between MDPs and Semi-MDPs: A Framework for Temporal Abstraction in Reinforcement Learning, 1999, Artif. Intell.

[8] Pierre-Yves Oudeyer, et al. The Playground Experiment: Task-Independent Development of a Curious Robot, 2005.

[9] Nuttapong Chentanez, et al. Intrinsically Motivated Reinforcement Learning, 2004, NIPS.