Optimizing robot striking movement primitives with Iterative Learning Control

Highly dynamic tasks that require large accelerations and precise tracking usually rely on accurate models and/or high-gain feedback. While movement primitives allow for efficient representation of such tasks from demonstrations, optimizing the required motor commands for systems with inaccurate dynamic models remains an open problem. To achieve accurate tracking for such tasks, we investigate two related Iterative Learning Control (ILC) update laws and present a variant suited for optimizing hitting movement primitives. The resulting algorithm generalizes well to different initial conditions and naturally addresses striking movements, where reaching specific velocities at certain positions is crucial. We evaluate the performance of our approach in a simulated putting task as well as in robotic table tennis, where we show how the striking performance of a seven-degree-of-freedom anthropomorphic arm can be optimized. Our final implemented algorithm compares favorably with two state-of-the-art approaches.
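The abstract does not spell out the update laws it investigates. As a rough illustration of what an ILC update does in general, the sketch below applies a textbook P-type law, u_{k+1}[t] = u_k[t] + L * e_k[t+1], to a hypothetical scalar plant whose input gain is deliberately mis-modeled. The plant parameters, the reference, and the function names `rollout` and `ilc` are all assumptions made for illustration; this is not the variant proposed in the paper.

```python
import math

def rollout(u, a=0.3, b=0.5, x0=0.0):
    """Simulate the (assumed) true plant x[t+1] = a*x[t] + b*u[t].

    Returns the states x[1..N], so element t is the output that u[t] affects.
    """
    x, xs = x0, []
    for u_t in u:
        x = a * x + b * u_t
        xs.append(x)
    return xs

def ilc(reference, n_iters=20, b_model=0.8):
    """Textbook P-type ILC with learning gain L = 1 / b_model.

    b_model is an intentionally inaccurate estimate of the true input gain b,
    standing in for an inaccurate dynamics model.
    """
    L = 1.0 / b_model
    u = [0.0] * len(reference)           # start from zero feedforward input
    rms_errors = []
    for _ in range(n_iters):
        y = rollout(u)                   # one trial on the plant
        e = [r - y_t for r, y_t in zip(reference, y)]
        rms_errors.append(math.sqrt(sum(v * v for v in e) / len(e)))
        # Correct each input with the error it caused one step later.
        u = [u_t + L * e_t for u_t, e_t in zip(u, e)]
    return u, rms_errors

# Hypothetical smooth reference over N steps (half a sine wave).
N = 50
reference = [math.sin(math.pi * (t + 1) / N) for t in range(N)]
u_final, rms_errors = ilc(reference)
print("first trial RMS error:", rms_errors[0])
print("last trial RMS error: ", rms_errors[-1])
```

In this toy setting the tracking error shrinks from trial to trial even though the gain is computed from the wrong model, because the learning gain satisfies L*b < 2*(1 - a) for the assumed plant. With a slower pole (a closer to 1) the same gain can produce large learning transients, which is one reason more careful update laws are studied.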
