Paper information - Deep Learning of Inverse Dynamic Models

Deep Learning of Inverse Dynamic Models

In the last few years, deep learning has elevated most fields of machine learning to new levels. However, deep learning has yet to prove its value in the field of model learning in robotics. Currently, dynamics models have to be derived analytically in a tedious process of measuring the mechanical system part by part. In applications like automated industrial production, where high precision and reliability are required, robot movements are engineered precisely, which costs time and money. One reason why the models are not learned may be that data-based approaches lack the physical plausibility of analytical models. In this thesis I propose the Deep Lagrangian Network (DeLaN), a deep learning approach to inverse dynamics models. The novelty of DeLaN is the incorporation of a physics prior, the Lagrange-Euler partial differential equation. Thus, DeLaN encodes physical knowledge while retaining the standard advantages of model learning, such as end-to-end training from data samples. I compare my proposed approach to a standard feed-forward neural network (FF-NN) in various experiments to showcase the usefulness of DeLaN in learning inverse dynamics models online. I also highlight the advantages and limitations of my approach that follow from the characteristics inherited from the physics prior. In a nutshell, DeLaN is able to learn and control a real robot and provides capabilities to extrapolate to unseen tasks.

Summary (translated from the German Zusammenfassung)

In recent years, deep learning has lifted many areas of machine learning to new levels. However, deep learning has yet to prove its value for model learning in robotics. Dynamics models are currently still derived analytically in a laborious process in which the mechanical system has to be measured piece by piece. In applications such as automated industrial production, which require high precision and reliability, robot movements are modeled precisely by engineers, which costs time and money. The reason why the models are currently not learned could be that data-based approaches do not offer the physical plausibility of analytical models. In this thesis I present the Deep Lagrangian Network (DeLaN), a deep learning approach to inverse dynamics models. The novelty of DeLaN is the incorporation of a physics prior, the Lagrange-Euler partial differential equation. DeLaN thereby encodes physical knowledge while still allowing the advantages of classical machine learning to be used, such as end-to-end training based on individual measurements. I compare my approach to a classical feed-forward neural network in various experiments to demonstrate the applicability of DeLaN to online learning of inverse dynamics models. I also present the advantages and limitations of my approach that follow from the properties of the incorporated physics prior. In short, DeLaN is able to learn in real time and to control a real robot, and it offers the ability to extrapolate to new, unseen tasks.
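
For orientation, the physics prior named in the abstract can be written out explicitly. The block below is a sketch of the standard Euler-Lagrange equation for a rigid-body system and the inverse dynamics model that follows from it. The symbols (q for joint positions, H for the mass matrix, V for potential energy, g for gravitational torques, \tau for motor torques) are notation assumed here and do not appear in the abstract itself; friction and other non-conservative effects are ignored, and derivatives with respect to q are taken as column vectors.

```latex
% Lagrangian: kinetic minus potential energy, with H(q) the symmetric
% positive definite mass matrix (assumed notation).
L(q, \dot{q}) = T - V = \tfrac{1}{2}\, \dot{q}^{\top} H(q)\, \dot{q} - V(q)

% Euler-Lagrange equation with generalized forces (motor torques) \tau:
\frac{d}{dt} \frac{\partial L}{\partial \dot{q}} - \frac{\partial L}{\partial q} = \tau

% Substituting L yields the inverse dynamics model, i.e. the torque
% required to realize a given (q, \dot{q}, \ddot{q}):
H(q)\, \ddot{q} + \dot{H}(q)\, \dot{q}
  - \tfrac{1}{2}\, \frac{\partial}{\partial q} \left( \dot{q}^{\top} H(q)\, \dot{q} \right)
  + g(q) = \tau,
\qquad g(q) = \frac{\partial V}{\partial q}
```

Read this way, a model constrained by the prior would only need to represent quantities such as H(q) and g(q), with the torque following from the equation rather than from an unconstrained function fit. Whether the thesis parameterizes the network in exactly this form is not stated in the abstract.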

Jan Peters | M. Lutter | Christian Ritter
