Deep Neural Network-Based Cooperative Visual Tracking Through Multiple Micro Aerial Vehicles

Multicamera tracking of humans and animals in outdoor environments is a relevant and challenging problem. Our approach to it involves a team of cooperating micro aerial vehicles (MAVs) with on-board cameras only. Deep neural networks (DNNs) often fail to detect objects that are small in scale or far away from the camera, both of which are typical in scenarios with aerial robots. Thus, the core problem addressed in this letter is how to achieve on-board, online, continuous, and accurate vision-based detections using DNNs for visual person tracking through MAVs. Our solution leverages cooperation among multiple MAVs and active selection of the most informative regions of the image. We demonstrate the efficiency of our approach through simulations with up to 16 robots and real-robot experiments involving two aerial robots tracking a person, while maintaining an active perception-driven formation. ROS-based source code is provided for the benefit of the community.
