End-to-End Instance Segmentation and Counting with Recurrent Attention

While convolutional neural networks have recently achieved impressive success on structured prediction problems such as semantic segmentation, differentiating individual object instances in a scene remains a challenge. Instance segmentation matters in a variety of applications, such as autonomous driving, image captioning, and visual question answering. Previous approaches address this problem by combining large graphical models with low-level vision. In contrast, we propose an end-to-end recurrent neural network (RNN) architecture with an attention mechanism that models a human-like counting process and produces detailed instance segmentations. The network is jointly trained to sequentially produce regions of interest as well as a dominant object segmentation within each region. The proposed model achieves state-of-the-art results on the CVPPP leaf segmentation dataset and the KITTI vehicle segmentation dataset.
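
The sequential process described above (attend to one region, segment its dominant object, repeat until nothing is left to count) can be illustrated with a toy sketch. This is not the proposed network: the learned attention and segmentation modules are replaced here by a hand-written flood fill, and the names `attend_and_segment`, `count_instances`, `explained`, and `max_steps` are hypothetical, introduced only for illustration. The sketch assumes a binary foreground image and keeps a record of already-segmented pixels so that each step looks at a new object.

```python
from collections import deque


def attend_and_segment(image, explained):
    """Toy stand-in for one step of the learned model (an assumption, not the paper's network).

    Finds the first foreground pixel not yet explained, then returns
    (box, mask) for the 4-connected blob containing it. Returns None
    when every foreground pixel has already been explained.
    """
    h, w = len(image), len(image[0])
    for r in range(h):
        for c in range(w):
            if image[r][c] and not explained[r][c]:
                # Flood fill from the seed pixel to collect the whole blob.
                mask = {(r, c)}
                queue = deque([(r, c)])
                while queue:
                    y, x = queue.popleft()
                    for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                        ny, nx = y + dy, x + dx
                        inside = 0 <= ny < h and 0 <= nx < w
                        if inside and image[ny][nx] and (ny, nx) not in mask:
                            mask.add((ny, nx))
                            queue.append((ny, nx))
                ys = [p[0] for p in mask]
                xs = [p[1] for p in mask]
                # Region of interest as (top, left, bottom, right), inclusive.
                box = (min(ys), min(xs), max(ys), max(xs))
                return box, mask
    return None


def count_instances(image, max_steps=10):
    """Sequentially produce one (box, mask) pair per object, then stop."""
    h, w = len(image), len(image[0])
    explained = [[0] * w for _ in range(h)]  # pixels already assigned to an instance
    instances = []
    for _ in range(max_steps):
        step = attend_and_segment(image, explained)
        if step is None:  # nothing left to count
            break
        box, mask = step
        for y, x in mask:
            explained[y][x] = 1
        instances.append((box, mask))
    return instances


image = [
    [1, 1, 0, 0, 0],
    [1, 1, 0, 1, 0],
    [0, 0, 0, 1, 0],
    [0, 0, 0, 0, 0],
    [0, 1, 1, 0, 0],
]
instances = count_instances(image)
print(len(instances), [box for box, _ in instances])
```

The number of loop iterations that return an object is the count, so counting and segmentation come out of the same sequential procedure. In the actual model both steps are learned jointly rather than hand-coded as they are here.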
