Learning anticipation policies for robot table tennis

Playing table tennis is a difficult task for robots, especially because of their limited acceleration. A key bottleneck is the time needed to bring the racket to the desired hitting position and velocity for returning the incoming ball. Often it does not suffice to extrapolate the ball's trajectory only after the opponent has returned it; more information is needed. Humans are able to predict the ball's trajectory from the opponent's movements and thus have a considerable advantage. Hence, we propose to incorporate an anticipation system into robot table tennis players, which enables the robot to react earlier, while the opponent is still performing the striking movement. Based on visual observation of the opponent's racket movement, the robot can predict where the opponent is aiming and adjust its movement generation accordingly. The policies for deciding how and when to react are obtained by reinforcement learning. We conduct experiments with an existing robot player to show that the learned reaction policy can significantly improve the performance of the overall system.
