Road Scene Segmentation from a Single Image

Road scene segmentation is important in computer vision for different applications such as autonomous driving and pedestrian detection. Recovering the 3D structure of road scenes provides relevant contextual information to improve their understanding. In this paper, we use a convolutional neural network based algorithm to learn features from noisy labels to recover the 3D scene layout of a road image. The novelty of the algorithm relies on generating training labels by applying an algorithm trained on a general image dataset to classify on–board images. Further, we propose a novel texture descriptor based on a learned color plane fusion to obtain maximal uniformity in road areas. Finally, acquired (off–line) and current (on–line) information are combined to detect road areas in single images. From quantitative and qualitative experiments, conducted on publicly available datasets, it is concluded that convolutional neural networks are suitable for learning 3D scene layout from noisy labels and provides a relative improvement of 7% compared to the baseline. Furthermore, combining color planes provides a statistical description of road areas that exhibits maximal uniformity and provides a relative improvement of 8% compared to the baseline. Finally, the improvement is even bigger when acquired and current information from a single image are combined.

[1]  J. Todd Book Review: Digital image processing (second edition). By R. C. Gonzalez and P. Wintz, Addison-Wesley, 1987. 503 pp. Price: £29.95. (ISBN 0-201-11026-1) , 1988 .

[2]  William K. Pratt,et al.  Digital image processing, 2nd Edition , 1991, A Wiley-Interscience publication.

[3]  Michael A. Arbib,et al.  The handbook of brain theory and neural networks , 1995, A Bradford book.

[4]  Yoshua Bengio,et al.  Convolutional networks for images, speech, and time series , 1998 .

[5]  Grouping dominant orientations for ill-structured road following , 2004, Proceedings of the 2004 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 2004. CVPR 2004..

[6]  Alexei A. Efros,et al.  Recovering Surface Layout from an Image , 2007, International Journal of Computer Vision.

[7]  Maria Petrou,et al.  Image processing - dealing with texture , 2020 .

[8]  Sebastian Thrun,et al.  Reverse Optical Flow for Self-Supervised Adaptive Autonomous Robot Navigation , 2007, International Journal of Computer Vision.

[9]  Andrew J. Davison,et al.  Active Matching , 2008, ECCV.

[10]  Roberto Cipolla,et al.  Segmentation and Recognition Using Structure from Motion Point Clouds , 2008, ECCV.

[11]  Koen E. A. van de Sande,et al.  Color Descriptors for Object Category Recognition , 2008, CGIV/MCS.

[12]  Koen E. A. van de Sande,et al.  Evaluation of color descriptors for object and scene recognition , 2008, 2008 IEEE Conference on Computer Vision and Pattern Recognition.

[13]  Philip H. S. Torr,et al.  Combining Appearance and Structure from Motion Features for Road Scene Understanding , 2009, BMVC.

[14]  Jean Ponce,et al.  Vanishing point detection for road detection , 2009, CVPR.

[15]  Sven J. Dickinson,et al.  TurboPixels: Fast Superpixels Using Geometric Flows , 2009, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[16]  Stephen Gould,et al.  Decomposing a scene into geometric and semantically consistent regions , 2009, 2009 IEEE 12th International Conference on Computer Vision.

[17]  Roberto Cipolla,et al.  Semantic object classes in video: A high-definition ground truth database , 2009, Pattern Recognit. Lett..

[18]  Luc Van Gool,et al.  Segmentation-Based Urban Traffic Scene Understanding , 2009, BMVC.

[19]  David Gerónimo Gómez,et al.  Survey of Pedestrian Detection for Advanced Driver Assistance Systems , 2010, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[20]  Trevor Darrell,et al.  Adapting Visual Category Models to New Domains , 2010, ECCV.

[21]  W. F. Clocksin,et al.  Joint Optimization for Object Class Segmentation and Dense Stereo Reconstruction , 2011, International Journal of Computer Vision.

[22]  Thomas Deselaers,et al.  ClassCut for Unsupervised Class Segmentation , 2010, ECCV.

[23]  Joseph F. Murray,et al.  Convolutional Networks Can Learn to Generate Affinity Graphs for Image Segmentation , 2010, Neural Computation.

[24]  Hubert Cecotti,et al.  Convolutional Neural Networks for P300 Detection with Application to Brain-Computer Interfaces , 2011, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[25]  W. F. Clocksin,et al.  Joint Optimization for Object Class Segmentation and Dense Stereo Reconstruction , 2012, International Journal of Computer Vision.

[26]  Antonio M. López,et al.  Road Detection Based on Illuminant Invariance , 2011, IEEE Transactions on Intelligent Transportation Systems.

[27]  Trevor Darrell,et al.  What you saw is not what you get: Domain adaptation using asymmetric kernel transforms , 2011, CVPR 2011.

[28]  Ivor W. Tsang,et al.  Domain Transfer Multiple Kernel Learning , 2012, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[29]  C. V. Jawahar,et al.  Scene Text Recognition using Higher Order Language Priors , 2009, BMVC.