Learning Operational Space Control

While operational space control is of essential importance for robotics and well understood from an analytical point of view, it can be prohibitively hard to achieve accurate control in the face of modeling errors, which are inevitable in complex robots such as humanoid robots. In such cases, learning control methods offer an interesting alternative to analytical control algorithms. However, the resulting learning problem is ill-defined, as it requires learning an inverse mapping of a usually redundant system, which is well known to suffer from non-convexity of the solution space; that is, the learning system could generate motor commands that try to steer the robot into physically impossible configurations. A first important insight of this paper is that a physically correct solution to the inverse problem nevertheless exists when learning of the inverse map is performed in a suitably piecewise linear way. The second crucial component of our work builds on the recent insight that many operational space controllers can be understood in terms of a constrained optimal control problem. The cost function associated with this optimal control problem allows us to formulate a learning algorithm that automatically synthesizes a globally consistent desired resolution of redundancy while learning the operational space controller. From a machine learning point of view, the learning problem corresponds to a reinforcement learning problem that maximizes an immediate reward and that employs an expectation-maximization policy search algorithm. Evaluations on a three-degrees-of-freedom robot arm illustrate the feasibility of the suggested approach.
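The reinforcement learning formulation above, which maximizes an immediate (one-step) reward with EM-style policy search, can be illustrated with a minimal sketch. The code below is a toy example, not the paper's controller: it uses a hypothetical linear policy on a synthetic state space, a made-up Gaussian reward around an assumed target mapping `w_true`, and the reward-weighted regression update that arises as the M-step of EM policy search for immediate rewards.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy immediate-reward problem (illustrative assumptions, not from the paper):
# states x in R^2, scalar motor command u, and a hypothetical ideal linear
# mapping w_true that the reward function implicitly encodes.
n, d = 500, 2
X = rng.normal(size=(n, d))        # sampled states
w_true = np.array([1.5, -0.7])     # assumed target mapping (for the toy reward)
theta = np.zeros(d)                # policy mean parameters
sigma = 1.0                        # fixed exploration noise (std)

for _ in range(20):
    # E-step-like rollout: explore with Gaussian noise around the current policy.
    U = X @ theta + sigma * rng.normal(size=n)
    # Immediate reward in (0, 1]: higher when the executed command is closer
    # to the (unknown to the learner) ideal command.
    r = np.exp(-0.5 * (U - X @ w_true) ** 2)
    # M-step: reward-weighted least-squares regression of executed commands
    # onto states; successful commands pull the policy toward themselves.
    W = np.diag(r)
    theta = np.linalg.solve(X.T @ W @ X, X.T @ W @ U)
```

With a fixed exploration noise, each reward-weighted update moves `theta` a fraction of the way toward the reward-optimal mapping, so the policy converges without ever computing gradients of the reward.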
