Learning Sparse Representations of Depth

This paper introduces a new method for learning and inferring sparse representations of depth (disparity) maps. The proposed algorithm relaxes the usual sparse coding assumption of a stationary noise model. This enables learning from data corrupted by spatially varying noise or uncertainty, such as data obtained from laser range scanners or structured light depth cameras. Sparse representations are learned from the Middlebury database disparity maps and then exploited in a two-layer graphical model for inferring depth from stereo, by including a sparsity prior on the learned features. Because they capture higher-order dependencies in the depth structure, these priors can complement the smoothness priors commonly used in depth inference based on Markov random field (MRF) models. Inference on the proposed graph is achieved with an alternating iterative optimization technique: the first layer is solved using an existing MRF-based stereo matching algorithm and is then held fixed while the second layer is solved using the proposed nonstationary sparse coding algorithm. This yields a general method for improving the solutions of state-of-the-art MRF-based depth estimation algorithms. Our experimental results first show that depth inference using learned representations leads to state-of-the-art denoising of depth maps obtained from laser range scanners and a time-of-flight camera. Furthermore, we show that adding sparse priors improves the results of depth estimation methods based on graph cut optimization of MRFs with first- and second-order priors.
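
To make the idea of relaxing the stationary noise assumption concrete, the following is a minimal sketch, not the algorithm proposed in the paper. It assumes that a depth patch y is modeled as y ≈ D a with a fixed dictionary D, that each pixel i has its own known noise standard deviation sigma_i, and that an L1 penalty serves as the sparsity prior, so the code a minimizes 0.5 * sum_i (y_i - (D a)_i)^2 / sigma_i^2 + lam * ||a||_1. The solver used here (ISTA) and the names `weighted_ista`, `weighted_objective`, `sigma`, and `lam` are illustrative assumptions.

```python
import numpy as np


def soft_threshold(x, t):
    """Elementwise soft-thresholding, the proximal operator of t * ||x||_1."""
    return np.sign(x) * np.maximum(np.abs(x) - t, 0.0)


def weighted_objective(a, y, D, sigma, lam):
    """Weighted least-squares data term plus L1 sparsity penalty."""
    w = 1.0 / sigma ** 2
    r = y - D @ a
    return 0.5 * np.sum(w * r ** 2) + lam * np.sum(np.abs(a))


def weighted_ista(y, D, sigma, lam=0.1, n_iter=200):
    """Sparse coding with per-pixel noise levels (assumed known), solved by ISTA.

    y     : (n,) vectorized depth patch
    D     : (n, k) dictionary, one atom per column
    sigma : (n,) per-pixel noise standard deviation (hypothetical input)
    """
    w = 1.0 / sigma ** 2
    # Step size 1/L, where L is the largest eigenvalue of D^T W D,
    # i.e. the squared spectral norm of sqrt(W) D.
    L = np.linalg.norm(D * np.sqrt(w)[:, None], 2) ** 2
    a = np.zeros(D.shape[1])
    for _ in range(n_iter):
        # Gradient of the weighted data term with respect to a.
        grad = D.T @ (w * (D @ a - y))
        a = soft_threshold(a - grad / L, lam / L)
    return a
```

With a constant `sigma` this reduces to ordinary L1 sparse coding with a rescaled penalty. Pixels with a large `sigma` (for example, unreliable range measurements) contribute little to the data term, so the sparse code is driven mainly by the trusted pixels.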
