Experiments with Hierarchical Reinforcement Learning of Multiple Grasping Policies

Robotic grasping has attracted considerable interest, but it still remains a challenging task. The data-driven approach is a promising solution to the robotic grasping problem; this approach leverages a grasp dataset and generalizes grasps for various objects. However, these methods often depend on the quality of the given datasets, which are not trivial to obtain with sufficient quality. Although reinforcement learning approaches have been recently used to achieve autonomous collection of grasp datasets, the existing algorithms are often limited to specific grasp types. In this paper, we present a framework for hierarchical reinforcement learning of grasping policies. In our framework, the lower-level hierarchy learns multiple grasp types, and the upper-level hierarchy learns a policy to select from the learned grasp types according to a point cloud of a new object. Through experiments, we validate that our approach learns grasping by constructing the grasp dataset autonomously. The experimental results show that our approach learns multiple grasping policies and generalizes the learned grasps by using local point cloud information.

[1]  R. Howe,et al.  Human grasp choice and robotic grasp analysis , 1990 .

[2]  Paul J. Besl,et al.  A Method for Registration of 3-D Shapes , 1992, IEEE Trans. Pattern Anal. Mach. Intell..

[3]  John F. Canny,et al.  Planning optimal grasps , 1992, Proceedings 1992 IEEE International Conference on Robotics and Automation.

[4]  Richard M. Murray,et al.  A Mathematical Introduction to Robotic Manipulation , 1994 .

[5]  Vijay Kumar,et al.  Robotic grasping and contact: a review , 2000, Proceedings 2000 ICRA. Millennium Conference. IEEE International Conference on Robotics and Automation. Symposia Proceedings (Cat. No.00CH37065).

[6]  Peter Auer,et al.  Using Confidence Bounds for Exploitation-Exploration Trade-offs , 2003, J. Mach. Learn. Res..

[7]  C. Rasmussen,et al.  Gaussian Process Priors with Uncertain Inputs - Application to Multiple-Step Ahead Time Series Forecasting , 2002, NIPS.

[8]  Agathe Girard,et al.  Prediction at an Uncertain Input for Gaussian Processes and Relevance Vector Machines Application to Multiple-Step Ahead Time-Series Forecasting , 2002 .

[9]  Christopher K. I. Williams,et al.  Gaussian Processes for Machine Learning (Adaptive Computation and Machine Learning) , 2005 .

[10]  Richard S. Sutton,et al.  Reinforcement Learning: An Introduction , 1998, IEEE Trans. Neural Networks.

[11]  Stefan Schaal,et al.  Reinforcement learning by reward-weighted regression for operational space control , 2007, ICML '07.

[12]  Oliver Kroemer,et al.  Combining active learning and reactive control for robot grasping , 2010, Robotics Auton. Syst..

[13]  Yasemin Altun,et al.  Relative Entropy Policy Search , 2010 .

[14]  Peter K. Allen,et al.  Data-driven grasping , 2011, Auton. Robots.

[15]  S. Kakade,et al.  Information-Theoretic Regret Bounds for Gaussian Process Optimization in the Bandit Setting , 2012, IEEE Transactions on Information Theory.

[16]  Jan Peters,et al.  A Survey on Policy Search for Robotics , 2013, Found. Trends Robotics.

[17]  Danica Kragic,et al.  Classical grasp quality evaluation: New algorithms and theory , 2013, 2013 IEEE/RSJ International Conference on Intelligent Robots and Systems.

[18]  Danica Kragic,et al.  Data-Driven Grasp Synthesis—A Survey , 2013, IEEE Transactions on Robotics.

[19]  Robert Platt,et al.  Localizing Handle-Like Grasp Affordances in 3D Point Clouds , 2014, ISER.

[20]  J. Fischer,et al.  The Prehensile Movements of the Human Hand , 2014 .

[21]  Honglak Lee,et al.  Deep learning for detecting robotic grasps , 2013, Int. J. Robotics Res..

[22]  Robert Platt,et al.  Using Geometry to Detect Grasp Poses in 3D Point Clouds , 2015, ISRR.

[23]  Markus Vincze,et al.  Learning grasps with topographic features , 2015, Int. J. Robotics Res..

[24]  Jan Peters,et al.  Bayesian optimization for learning gaits under uncertainty , 2015, Annals of Mathematics and Artificial Intelligence.

[25]  Sergey Levine,et al.  Learning Hand-Eye Coordination for Robotic Grasping with Large-Scale Data Collection , 2016, ISER.

[26]  Ales Leonardis,et al.  One-shot learning and generation of dexterous grasps for novel objects , 2016, Int. J. Robotics Res..

[27]  Abhinav Gupta,et al.  Supersizing self-supervision: Learning to grasp from 50K tries and 700 robot hours , 2015, 2016 IEEE International Conference on Robotics and Automation (ICRA).

[28]  Ai Poh Loh,et al.  Model-based contextual policy search for data-efficient generalization of robot skills , 2017, Artif. Intell..

[29]  Sergey Levine,et al.  Learning hand-eye coordination for robotic grasping with deep learning and large-scale data collection , 2016, Int. J. Robotics Res..