Single-Camera and Inter-Camera Vehicle Tracking and 3D Speed Estimation Based on Fusion of Visual and Semantic Features

Tracking vehicles across multiple cameras with non-overlapping views is a challenging task for intelligent transportation systems (ITS), mainly because of the high similarity among vehicle models, frequent occlusions, large variations in viewing perspective, and low video resolution. In this work, we propose a fusion of visual and semantic features for both single-camera tracking (SCT) and inter-camera tracking (ICT). Specifically, a histogram-based adaptive appearance model is introduced to learn the long-term history of visual features for each vehicle target. In addition, semantic features, including trajectory smoothness, velocity change, and temporal information, are incorporated into a bottom-up clustering strategy for data association within each single camera view. Across camera views, we further exploit deep learning features, detected license plate features, and detected car types for vehicle re-identification. Additionally, evolutionary optimization is applied to camera calibration for reliable 3D speed estimation. Our algorithm achieved the top performance in both 3D speed estimation and vehicle re-identification at the NVIDIA AI City Challenge 2018.
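To illustrate the final step, the sketch below shows how 3D speed can be estimated once camera calibration is available: image points are back-projected to the ground plane through a homography, and speed follows from the ground-plane displacement over time. This is a minimal illustration, not the paper's implementation; the matrix `H` and all point coordinates are made-up example values, and the evolutionary optimization of the calibration itself is out of scope here.

```python
import numpy as np

# Hypothetical ground-plane homography (image pixels -> meters on the road).
# In the paper this would come from camera calibration; these numbers are
# invented purely for illustration.
H = np.array([[0.05,  0.00, -10.0],
              [0.00,  0.08, -15.0],
              [0.001, 0.002,  1.0]])

def image_to_ground(pt, H):
    """Back-project a pixel (u, v) to ground-plane coordinates (x, y) in meters."""
    u, v = pt
    x, y, w = H @ np.array([u, v, 1.0])
    return np.array([x / w, y / w])  # homogeneous normalization

def speed_kmh(pt1, pt2, dt, H):
    """Estimate speed from two image positions of the same vehicle dt seconds apart."""
    disp = image_to_ground(pt2, H) - image_to_ground(pt1, H)
    return np.linalg.norm(disp) / dt * 3.6  # m/s -> km/h
```

In practice the per-frame estimates would be smoothed along the tracked trajectory, since detection jitter in the image maps to large ground-plane noise far from the camera.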
