DOOMED: Direct Online Optimization of Modeling Errors in Dynamics

It has long been hoped that model-based control will improve tracking performance while maintaining or increasing compliance. This hope hinges on having, or being able to estimate, an accurate inverse dynamics model. As a result, substantial effort has gone into building and estimating dynamics models and models of their errors. Most recent research has focused on learning the true inverse dynamics from data points that map observed accelerations to the torques used to generate them. Unfortunately, if the initial tracking error is large, such learning processes may train substantially off-distribution: they learn to predict well at the accelerations actually observed rather than at the desired accelerations. This work takes a different approach. We define a class of gradient-based online learning algorithms, which we term Direct Online Optimization of Modeling Errors in Dynamics (DOOMED), that directly minimize an objective measuring the divergence between actual and desired accelerations. Our objective is defined in terms of the true system's unknown dynamics and therefore cannot be evaluated directly. However, we show that its gradient is observable online from system data. Building on this, we develop a novel adaptive control approach that runs online learning on the robot's data stream to correct (inverse) dynamics errors in real time, so that desired accelerations are accurately achieved during execution.
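The idea of an objective that cannot be evaluated but whose gradient is observable can be illustrated with a minimal one-joint sketch. Everything below is an assumption made for illustration and is not taken from the abstract: the toy dynamics `m_true * qdd + b_true * qd = tau`, the scalar torque offset `w`, the choice of weighting the acceleration error by the true inertia, and all numeric values. Under that weighting, the gradient of `0.5 * m_true * (qdd_des - qdd)**2` with respect to `w` reduces to `-(qdd_des - qdd)`, because the unknown factor `1 / m_true` from the dynamics cancels against the weight. The update therefore needs only the measured acceleration.

```python
def simulate(steps=2000, dt=0.001, lr=0.5):
    """Toy 1-DOF example: learn a torque offset online from acceleration error."""
    # True system parameters. The controller never reads these directly;
    # they are used only to generate the "measured" acceleration.
    m_true, b_true = 2.0, 0.7
    # Deliberately wrong model: wrong inertia and no friction term.
    m_model = 1.0

    qd = 1.0       # joint velocity
    w = 0.0        # learned torque offset (hypothetical parameterization)
    errors = []

    for _ in range(steps):
        qdd_des = 1.0                          # desired acceleration
        tau = m_model * qdd_des + w            # model torque plus learned offset
        qdd = (tau - b_true * qd) / m_true     # acceleration the system produces
        err = qdd_des - qdd                    # observable from sensor data

        # Gradient step on 0.5 * m_true * err**2 with respect to w.
        # d(qdd)/dw = 1 / m_true cancels the m_true weight, so the
        # gradient is simply -err and no true parameter is needed.
        w += lr * err

        qd += qdd * dt                         # integrate velocity
        errors.append(abs(err))

    return errors, w


if __name__ == "__main__":
    errors, w = simulate()
    print(f"initial |error| = {errors[0]:.3f}, final |error| = {errors[-1]:.5f}")
```

In this sketch the acceleration error shrinks geometrically and then settles at a small residual, because friction keeps changing as the velocity grows and the offset has to keep tracking it. A richer offset model (for example, one that depends on velocity) would be a natural extension, but that is beyond what the abstract specifies.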
