A Collaborative Visual Tracking Architecture for Correlation Filter and Convolutional Neural Network Learning

Visual object tracking has achieved remarkable progress in recent years and has been broadly applied in intelligent transportation systems, such as autonomous vehicles and drones, to monitor and analyze the behavior of specific targets. One typical tracking approach is the discriminative tracker, which branches into two main categories: the correlation filter (CF) and the convolutional neural network (CNN). However, most current research treats the two categories as separate techniques and relies on only one of them. Close cooperation between the CF and the CNN therefore remains largely unexplored, and the question of how to combine both techniques effectively to further boost tracking performance is still open. To address this issue, we propose a collaborative architecture that incorporates models constructed with both techniques and dynamically aggregates their response maps for target inference. Through alternating optimization, each model is learned on the other’s errors, which continually improves the classification power of the whole tracker. For further efficiency, we present a faster solver for the CF we use and an analytical solution for dynamic model weighting. Through experiments on standard benchmarks, we reveal the influence of key factors on the joint learning architecture and show that it outperforms state-of-the-art approaches.
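To make the aggregation step concrete, the following is a minimal sketch of fusing two response maps with per-model weights and reading the target position off the peak. It is an illustration under assumptions, not the solver described in the paper: the function names `fuse_responses` and `locate_peak`, the plain weighted sum, and the example weights are all hypothetical, and the analytical weighting solution mentioned above is not reproduced here.

```python
from typing import List, Tuple

ResponseMap = List[List[float]]


def fuse_responses(cf_map: ResponseMap, cnn_map: ResponseMap,
                   w_cf: float, w_cnn: float) -> ResponseMap:
    """Weighted sum of two same-shaped response maps.

    The weights are assumed to be given; how they are chosen per frame
    is outside the scope of this sketch.
    """
    same_rows = len(cf_map) == len(cnn_map)
    if not same_rows or any(len(a) != len(b) for a, b in zip(cf_map, cnn_map)):
        raise ValueError("response maps must have the same shape")
    return [[w_cf * a + w_cnn * b for a, b in zip(row_cf, row_cnn)]
            for row_cf, row_cnn in zip(cf_map, cnn_map)]


def locate_peak(response: ResponseMap) -> Tuple[int, int]:
    """Return (row, col) of the largest value in the fused response map."""
    cells = [(r, c) for r in range(len(response))
             for c in range(len(response[r]))]
    return max(cells, key=lambda rc: response[rc[0]][rc[1]])


# Toy 2x2 maps: the CF model peaks at (0, 1), the CNN model at (1, 0).
cf_response = [[0.1, 0.9], [0.2, 0.3]]
cnn_response = [[0.2, 0.1], [0.8, 0.1]]

# Trusting the CF model more keeps the estimate at the CF peak.
print(locate_peak(fuse_responses(cf_response, cnn_response, 0.7, 0.3)))  # (0, 1)
# Trusting the CNN model more moves the estimate to the CNN peak.
print(locate_peak(fuse_responses(cf_response, cnn_response, 0.2, 0.8)))  # (1, 0)
```

The toy example shows why the weighting matters: when the two models disagree, the fused peak follows whichever model currently receives the larger weight.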
