Multiscale geometric wavelets for the analysis of point clouds

Data sets are often modeled as point clouds in ℝ^D, for D large. It is often assumed that the data has some interesting low-dimensional structure, for example that of a d-dimensional manifold M, with d much smaller than D. When M is simply a linear subspace, one may exploit this assumption to encode the data efficiently by projecting onto a dictionary of d vectors in ℝ^D (for example found by SVD), at a cost of (d + n)D numbers for n data points. When M is nonlinear, there are no "explicit" constructions of dictionaries that achieve a similar efficiency: typically one uses either random dictionaries or dictionaries obtained by black-box optimization. In this paper we construct data-dependent multiscale dictionaries that aim at efficiently encoding and manipulating the data. Their construction is fast, and so are the algorithms that map data points to dictionary coefficients and vice versa. In addition, data points are guaranteed to have a sparse representation in terms of the dictionary. We think of these dictionaries as the analogue of wavelets, but for approximating point clouds rather than functions.
