Towards learning hierarchical skills for multi-phase manipulation tasks

Most manipulation tasks can be decomposed into a sequence of phases, where the robot's actions have different effects in each phase. The robot can perform actions to transition between phases and thereby change the effects of its subsequent actions, e.g., by grasping an object in order to then lift it. In this way, the robot can reach a phase that affords the desired manipulation. In this paper, we present an approach for exploiting the phase structure of tasks in order to learn manipulation skills more efficiently. Starting with human demonstrations, the robot learns a probabilistic model of the phases and the phase transitions. The robot then employs model-based reinforcement learning to create a library of motor primitives for transitioning between phases. The learned motor primitives generalize to new situations and tasks. Given this library, the robot uses a value function approach to learn a high-level policy for sequencing the motor primitives. The proposed method was successfully evaluated on a real robot performing a bimanual grasping task.
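
The idea of sequencing primitives with a value function can be illustrated with a small tabular sketch. This is not the method from the paper: it assumes discrete phases, a known phase-transition model, and a reward of 1 for entering the goal phase. The phase names, primitive names, and probabilities are all hypothetical, chosen only to show how a value function yields a high-level policy over primitives.

```python
# Hypothetical phases and primitives for illustration; not taken from the paper.
PHASES = ["no_contact", "one_hand_contact", "bimanual_grasp"]
GOAL = "bimanual_grasp"  # terminal phase; entering it yields reward 1

# TRANSITIONS[phase][primitive] -> {next_phase: probability} (made-up values).
TRANSITIONS = {
    "no_contact": {
        "reach_left": {"one_hand_contact": 0.8, "no_contact": 0.2},
        "reach_both": {"bimanual_grasp": 0.3, "no_contact": 0.7},
    },
    "one_hand_contact": {
        "reach_right": {"bimanual_grasp": 0.9, "one_hand_contact": 0.1},
        "release": {"no_contact": 1.0},
    },
}


def value_iteration(gamma=0.9, sweeps=100):
    """Return (values, policy) for the toy phase model above."""
    values = {phase: 0.0 for phase in PHASES}

    def q(phase, primitive):
        # Expected reward plus discounted value of the resulting phase.
        return sum(
            prob * ((1.0 if nxt == GOAL else 0.0) + gamma * values[nxt])
            for nxt, prob in TRANSITIONS[phase][primitive].items()
        )

    # In-place Bellman backups over all non-terminal phases.
    for _ in range(sweeps):
        for phase in TRANSITIONS:
            values[phase] = max(q(phase, a) for a in TRANSITIONS[phase])

    # Greedy high-level policy: the best primitive to execute in each phase.
    policy = {
        phase: max(TRANSITIONS[phase], key=lambda a: q(phase, a))
        for phase in TRANSITIONS
    }
    return values, policy


if __name__ == "__main__":
    values, policy = value_iteration()
    print(policy)
```

In this toy model the greedy policy prefers the two-step sequence through `one_hand_contact` over the direct but unreliable `reach_both` primitive, which is the kind of trade-off a high-level sequencing policy has to resolve.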
