Improved generic categorical object detection fusing depth cue with 2D appearance and shape features

We propose a novel 3D depth cue-based model for generic categorical object detection, which extends our previous 2D feature-based method for detecting objects under severe occlusion. Because the new model integrates a complementary 3D depth cue with 2D appearance and shape features, it significantly improves the detection performance and robustness of the existing 2D-based object detection system. The depth cue is derived from a disparity map obtained by stereo matching of the input image pairs. The disparity map is clustered into layers; appearance and shape features are then extracted at each layer and matched against the learnt 2D codebooks. Finally, the detection hypotheses from all layers are merged to produce the final detection result. Experimental results show that the 3D depth cue-based model achieves a 2.57% gain in average recall rate over the 2D feature-based method on our collected stereo car-side dataset.
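
The layering and merging steps described above can be illustrated with a minimal sketch. The abstract does not state which clustering algorithm or merging rule is used, so everything here is an assumption made for illustration: the 1-D k-means over disparity values, the greedy overlap-based suppression, and the names `cluster_disparity`, `merge_hypotheses`, `n_layers` and `iou_thresh` are all hypothetical. Feature extraction and codebook matching are omitted.

```python
import numpy as np


def cluster_disparity(disparity, n_layers=3, n_iter=20):
    """Assumed layering step: 1-D k-means over valid disparity values.

    Returns a label map (-1 marks invalid pixels) and the layer centers.
    """
    valid = np.isfinite(disparity) & (disparity > 0)
    values = disparity[valid].astype(float)
    # Initialise centers at evenly spaced interior quantiles.
    centers = np.quantile(values, np.linspace(0, 1, n_layers + 2)[1:-1])
    for _ in range(n_iter):
        assign = np.argmin(np.abs(values[:, None] - centers[None, :]), axis=1)
        for k in range(n_layers):
            if np.any(assign == k):
                centers[k] = values[assign == k].mean()
    # Final assignment against the converged centers.
    assign = np.argmin(np.abs(values[:, None] - centers[None, :]), axis=1)
    labels = np.full(disparity.shape, -1, dtype=int)
    labels[valid] = assign
    return labels, centers


def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2, ...)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def merge_hypotheses(per_layer, iou_thresh=0.5):
    """Assumed merging step: pool (x1, y1, x2, y2, score) hypotheses from all
    layers and keep the highest-scoring box among overlapping ones."""
    pooled = sorted(
        (h for layer in per_layer for h in layer),
        key=lambda h: h[4],
        reverse=True,
    )
    kept = []
    for h in pooled:
        if all(iou(h, k) < iou_thresh for k in kept):
            kept.append(h)
    return kept
```

In such a sketch, each layer mask (`labels == k`) would restrict where 2D features are extracted, so that an occluding foreground object and a partly hidden background object are handled in separate layers before their hypotheses are pooled.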