Autonomous Blimp Control using Deep Reinforcement Learning

Aerial robots are becoming ubiquitous across a growing number of tasks. Among the various types of aerial robots, blimps are particularly well suited to long-duration missions: they are energy-efficient, relatively silent, and safe. To address the blimp navigation and control task, in our recent work [27] we developed a software-in-the-loop simulation and a PID-based controller for large blimps under wind disturbance. However, blimps have a deformable structure, and their dynamics are inherently nonlinear and time-delayed, which often results in large trajectory-tracking errors. Moreover, a blimp's buoyancy changes constantly with ambient temperature and pressure. In the present paper, we explore a deep reinforcement learning (DRL) approach to address these issues. We train only in simulation, while keeping conditions as close as possible to the real-world scenario. We derive a compact state representation to reduce training time and a discrete action space to enforce control smoothness. Our initial simulation results show that DRL has significant potential for solving the blimp control task and is robust against moderate wind and parameter uncertainty. We present extensive experiments studying the robustness of our approach and openly provide its source code.
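To make the two design choices above concrete, here is a minimal, hypothetical sketch in a Gym-style interface: the observation is a compact vector of tracking errors rather than raw poses, and each discrete action applies a small, bounded actuator increment. The class name, action set, and placeholder dynamics are our own illustration under these assumptions, not the paper's released code.

```python
import numpy as np
import gymnasium as gym
from gymnasium import spaces


class BlimpControlEnv(gym.Env):
    """Illustrative skeleton of a blimp-control environment with a
    compact state and a discrete action space (hypothetical names;
    not the paper's implementation)."""

    # Discrete actions: small, bounded increments to three normalized
    # actuator channels (thrust, elevator, rudder), plus a no-op.
    ACTION_DELTAS = np.array([
        [0.0, 0.0, 0.0],    # no-op
        [+0.1, 0.0, 0.0],   # more thrust
        [-0.1, 0.0, 0.0],   # less thrust
        [0.0, +0.1, 0.0],   # elevator up
        [0.0, -0.1, 0.0],   # elevator down
        [0.0, 0.0, +0.1],   # rudder right
        [0.0, 0.0, -0.1],   # rudder left
    ])

    def __init__(self):
        self.action_space = spaces.Discrete(len(self.ACTION_DELTAS))
        # Compact observation: tracking errors and a few body rates
        # instead of raw poses, which shrinks the space the agent
        # must explore and thus reduces training time.
        self.observation_space = spaces.Box(-1.0, 1.0, shape=(6,), dtype=np.float32)
        self._actuators = np.zeros(3)

    def reset(self, *, seed=None, options=None):
        super().reset(seed=seed)
        self._actuators[:] = 0.0
        self._state = self.np_random.uniform(-0.5, 0.5, size=6).astype(np.float32)
        return self._state, {}

    def step(self, action):
        # The chosen increment is added to the actuator commands, so a
        # single step can change them only by a bounded amount; this is
        # what enforces smoothness at the control level.
        self._actuators = np.clip(
            self._actuators + self.ACTION_DELTAS[action], -1.0, 1.0)
        # Placeholder dynamics: the real environment would step a
        # software-in-the-loop blimp simulator with wind disturbance.
        noise = self.np_random.normal(size=6).astype(np.float32)
        self._state = np.clip(0.99 * self._state + 0.01 * noise, -1.0, 1.0)
        # Reward tracking accuracy: penalize position/heading errors.
        reward = -float(np.linalg.norm(self._state[:3]))
        return self._state, reward, False, False, {}
```

Because the actuator commands can change by at most a fixed increment per step, any policy over this action set is rate-limited by construction; a standard discrete-action DRL agent (e.g., a DQN- or IMPALA-style learner, cf. [13], [17]) can be trained on such an interface unchanged.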

[1] Santiago Zazo et al. Robust Deep Reinforcement Learning for Underwater Navigation with Unknown Disturbances. IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2021.

[2] Patrick Rives et al. Visual servo control for the hovering of an outdoor robotic airship. IEEE International Conference on Robotics and Automation (ICRA), 2002.

[3] Alexandra Moutinho et al. Hover Control of an UAV With Backstepping Design Including Input Saturations. IEEE Transactions on Control Systems Technology, 2008.

[4] James F. Whidborne et al. Adaptive sliding-mode-backstepping trajectory tracking control of underactuated airships. 2020.

[5] Michael J. Black et al. Markerless Outdoor Human Motion Capture Using Multiple Autonomous Micro Aerial Vehicles. IEEE/CVF International Conference on Computer Vision (ICCV), 2019.

[6] Meyer Nahon et al. Modeling and Simulation of Airship Dynamics. 2007.

[7] Ioannis Pitas et al. High-Level Multiple-UAV Cinematography Tools for Covering Outdoor Events. IEEE Transactions on Broadcasting, 2019.

[8] Yuchen Zhang et al. Bridging Theory and Algorithm for Domain Adaptation. ICML, 2019.

[9] W. Grossman et al. Autonomous Searching and Tracking of a River using an UAV. American Control Conference (ACC), 2007.

[10] Zewei Zheng et al. Three-Dimensional Path-Following Control of a Robotic Airship with Reinforcement Learning. International Journal of Aerospace Engineering, 2019.

[11] Alexandra Moutinho et al. Stability and Robustness Analysis of the AURORA Airship Control System using Dynamic Inversion. IEEE International Conference on Robotics and Automation (ICRA), 2005.

[12] Dieter Fox et al. Gaussian Processes and Reinforcement Learning for Identification and Control of an Autonomous Blimp. IEEE International Conference on Robotics and Automation (ICRA), 2007.

[13] Shane Legg et al. IMPALA: Scalable Distributed Deep-RL with Importance Weighted Actor-Learner Architectures. ICML, 2018.

[14] Yasmina Bestaoui et al. Stabilization of a nonlinear underactuated autonomous airship: a combined averaging and backstepping approach. Third International Workshop on Robot Motion and Control (RoMoCo), 2002.

[15] Fazel Naghdy et al. Control of autonomous airship. IEEE International Conference on Robotics and Biomimetics (ROBIO), 2009.

[16] Yuanjun Sang et al. Robust model predictive control for stratospheric airships using LPV design. Control Engineering Practice, 2018.

[17] Alex Graves et al. Playing Atari with Deep Reinforcement Learning. arXiv, 2013.

[18] Roland Siegwart et al. Control of a Quadrotor With Reinforcement Learning. IEEE Robotics and Automation Letters, 2017.

[19] Vinay Pratap Singh et al. Reinforcement learning in robotic applications: a comprehensive survey. Artificial Intelligence Review, 2021.

[20] James P. Ostrowski et al. Visual servoing with dynamics: control of an unmanned blimp. IEEE International Conference on Robotics and Automation (ICRA), 1999.

[21] Masahito Yamamoto et al. PID landing orbit motion controller for an indoor blimp robot. Artificial Life and Robotics, 2006.

[22] Michael I. Jordan et al. RLlib: Abstractions for Distributed Reinforcement Learning. ICML, 2018.

[23] S. Shankar Sastry et al. Autonomous Helicopter Flight via Reinforcement Learning. NIPS, 2003.

[24] Wolfram Burgard et al. Adaptive autonomous control using online value iteration with Gaussian processes. IEEE International Conference on Robotics and Automation (ICRA), 2009.

[25] Demis Hassabis et al. Mastering the game of Go with deep neural networks and tree search. Nature, 2016.

[26] Zhenbang Gong et al. A flight control and navigation system of a small size unmanned airship. IEEE International Conference on Mechatronics and Automation, 2005.

[27] Michael J. Black et al. Simulation and Control of Deformable Autonomous Airships in Turbulent Wind. arXiv, 2020.

[28] Sameera S. Ponda et al. Autonomous navigation of stratospheric balloons using reinforcement learning. Nature, 2020.

[29] Alexandre Bernardino et al. Vision based station keeping and docking for an aerial blimp. IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2000.

[30] Koichi Osuka et al. Inverse optimal tracking control of an aerial blimp robot. Fifth International Workshop on Robot Motion and Control (RoMoCo), 2005.

[31] Henry Zhu et al. Dexterous Manipulation with Deep Reinforcement Learning: Efficient, General, and Low-Cost. IEEE International Conference on Robotics and Automation (ICRA), 2019.

[32] Takashi Kohno et al. Hovering control of outdoor blimp robots based on path following. IEEE International Conference on Control Applications, 2010.

[33] Yaoliang Yu et al. Distributional Reinforcement Learning for Efficient Exploration. ICML, 2019.

[34] Kate Saenko et al. Regularizing Action Policies for Smooth Control with Reinforcement Learning. IEEE International Conference on Robotics and Automation (ICRA), 2021.

[35] Geoffrey A. Hollinger et al. Multi-UAV exploration with limited communication and battery. IEEE International Conference on Robotics and Automation (ICRA), 2015.

[36] Gerardo G. Acosta et al. AUV Position Tracking Control Using End-to-End Deep Reinforcement Learning. OCEANS MTS/IEEE Charleston, 2018.

[37] Richard S. Sutton et al. Reinforcement Learning: An Introduction. MIT Press, 1998.

[38] Hiroaki Fukushima et al. Model predictive control of an autonomous blimp with input and output constraints. IEEE International Conference on Control Applications, 2006.

[39] Sergio Gomez Colmenarejo et al. Acme: A Research Framework for Distributed Reinforcement Learning. arXiv, 2020.

[40] Lin Cheng et al. Robust three-dimensional path-following control for an under-actuated stratospheric airship. 2019.

[41] Sandra Johnson et al. Unmanned Aerial Vehicles (UAVs) and Artificial Intelligence Revolutionizing Wildlife Monitoring and Conservation. Sensors, 2016.

[42] Fangliang Chen et al. Detecting and tracking vehicles in traffic by unmanned aerial vehicles. 2016.