Instance Segmentation by Learning Pixel Neighbour Relations with a CNN

Instance segmentation is an important task in the context of automated vehicles. It provides precise object locations and shapes from camera images. In this paper, we propose an approach to instance segmentation based on deep learning. A convolutional neural network predicts relations between neighbouring pixels. It also predicts which regions of the image lie inside instances. This representation is then decoded to obtain the object instances. Our approach achieves an average precision of 13.9 % over all classes on the validation dataset of the challenging Cityscapes benchmark. In contrast to other encoding-based approaches, the presented approach can associate freestanding (unconnected) areas of the same instance. Such areas can arise if an object is partially occluded. We associate them by learning long-range neighbour relations that consider more than directly adjacent pixels. We show that these long-range neighbour relations result in higher accuracies for all classes.
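
To illustrate how predicted neighbour relations and an inside-instance mask could be decoded into instances, the following is a minimal sketch and not the decoding procedure of the paper, which the abstract does not specify. It assumes that the network output has already been thresholded into boolean arrays and merges related pixels with a union-find structure. The function name decode_instances, the array layout and the offsets list are hypothetical choices made for this example.

```python
import numpy as np


def decode_instances(foreground, relations, offsets):
    """Group foreground pixels into instances from pairwise neighbour relations.

    Assumed (hypothetical) inputs:
    foreground: (H, W) bool array, True where a pixel lies inside an instance.
    relations:  (K, H, W) bool array; relations[k, y, x] is True if pixel (y, x)
                and pixel (y + dy, x + dx) are predicted to belong to the same
                instance, with (dy, dx) = offsets[k].
    offsets:    list of K (dy, dx) tuples; long-range entries have |dy| or |dx| > 1.
    Returns an (H, W) int array: 0 for background, 1..N for instances.
    """
    h, w = foreground.shape
    parent = list(range(h * w))  # union-find forest over flattened pixel indices

    def find(i):
        # Follow parent links to the root, shortening the path on the way.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for k, (dy, dx) in enumerate(offsets):
        for y in range(h):
            for x in range(w):
                ny, nx = y + dy, x + dx
                if not (0 <= ny < h and 0 <= nx < w):
                    continue
                # Merge only pixels that are both inside an instance and related.
                if foreground[y, x] and foreground[ny, nx] and relations[k, y, x]:
                    a, b = find(y * w + x), find(ny * w + nx)
                    if a != b:
                        parent[b] = a

    labels = np.zeros((h, w), dtype=int)
    ids = {}  # maps union-find root to a consecutive instance id
    for y in range(h):
        for x in range(w):
            if foreground[y, x]:
                root = find(y * w + x)
                labels[y, x] = ids.setdefault(root, len(ids) + 1)
    return labels


# One image row in which an occluder (column 2) splits a single object.
fg = np.array([[1, 1, 0, 1, 1]], dtype=bool)
short = decode_instances(fg, np.ones((1, 1, 5), dtype=bool), [(0, 1)])
both = decode_instances(fg, np.ones((2, 1, 5), dtype=bool), [(0, 1), (0, 3)])
print(short.tolist())  # [[1, 1, 0, 2, 2]]
print(both.tolist())   # [[1, 1, 0, 1, 1]]
```

With only the direct-neighbour offset (0, 1), the two visible parts receive different labels because no relation bridges the occluded column. Adding the assumed long-range offset (0, 3) links them into one instance, which is the effect the abstract attributes to long-range neighbour relations.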
