Learning inverse dynamics models in O(n) time with LSTM networks

Learning inverse dynamics models is crucial for modern robots whose analytic models cannot capture the complex dynamics of compliant actuators, elasticities, mechanical inaccuracies, frictional effects, or sensor noise. However, such models are highly nonlinear, and millions of samples are needed to encode a large number of motor skills. Consequently, current state-of-the-art model learning approaches such as Gaussian Processes, whose training cost scales cubically with the number of samples, cannot be applied. In this work, we developed an inverse dynamics model learning approach based on a long short-term memory (LSTM) network with a time complexity of O(n). We evaluated the approach on a KUKA robot arm that was used in object manipulation skills with various loads. In a comparison with Gaussian Processes, we show that LSTM networks achieve better prediction performance and can be trained on large datasets of more than 100,000 samples in a few seconds. Moreover, because the training batch size is small (for example, 128 samples), the network can be continuously improved in life-long learning scenarios.
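The scaling argument can be illustrated with a rough operation count. The sketch below is not the authors' implementation. It assumes that n denotes the number of training samples, that one epoch of mini-batch training visits every sample once, and that exact Gaussian Process training is dominated by factorizing an n-by-n kernel matrix (on the order of n^3 operations). The function names are hypothetical and the counts are order-of-magnitude proxies, not measured runtimes.

```python
import math


def minibatch_updates_per_epoch(n_samples: int, batch_size: int = 128) -> int:
    """Gradient updates needed to visit every sample once (assumed training scheme)."""
    return math.ceil(n_samples / batch_size)


def minibatch_cost(n_samples: int, epochs: int = 1) -> int:
    """Proxy for mini-batch training cost: samples visited, linear in n_samples."""
    return epochs * n_samples


def exact_gp_cost(n_samples: int) -> int:
    """Proxy for exact GP training cost: order n^3 for the kernel matrix factorization."""
    return n_samples ** 3


if __name__ == "__main__":
    n = 100_000
    print("updates per epoch:", minibatch_updates_per_epoch(n))
    # Doubling the dataset doubles the linear proxy but multiplies the cubic proxy by 8.
    print("linear growth factor:", minibatch_cost(2 * n) // minibatch_cost(n))
    print("cubic growth factor:", exact_gp_cost(2 * n) // exact_gp_cost(n))
```

With 100,000 samples and a batch size of 128, one pass over the data takes 782 updates, each touching at most 128 samples. This fixed per-update cost is also why new data can be folded in incrementally.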
