Metric recovery from directed unweighted graphs

We analyze directed, unweighted graphs obtained from points $x_i\in \mathbb{R}^d$ by connecting vertex $i$ to vertex $j$ iff $|x_i - x_j| < \epsilon(x_i)$. Examples of such graphs include $k$-nearest neighbor graphs, where $\epsilon(x_i)$ varies from point to point, and, arguably, many real-world graphs such as co-purchasing graphs. We ask whether we can recover the underlying Euclidean metric $\epsilon(x_i)$ and the associated density $p(x_i)$ given only the directed graph and the dimension $d$. We show that consistent recovery is possible up to isometric scaling when the vertex degree is at least $\omega(n^{2/(2+d)}\log(n)^{d/(d+2)})$. Our estimator is based on a careful characterization of a random walk over the directed graph and of its continuum limit. As an algorithm, it resembles the PageRank centrality metric. We demonstrate empirically that the estimator performs well on simulated examples as well as on real-world co-purchasing graphs, even with a small number of points and with degree scaling as low as $\log(n)$.
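
To make the setup concrete, the following is a minimal sketch of the two objects the abstract refers to: a directed $k$-nearest neighbor graph built from points in $\mathbb{R}^d$, and the stationary distribution of a random walk on it. It is not the paper's estimator, which is not specified here. The function names, the brute-force distance computation, and the small `teleport` term (a PageRank-style smoothing added so the walk has a unique stationary distribution) are all assumptions made for illustration.

```python
import numpy as np

def knn_digraph(points, k):
    """Return adjacency matrix A with A[i, j] = 1 iff j is one of the
    k nearest neighbors of i (edges point from i to j, so A need not be symmetric)."""
    n = len(points)
    # Brute-force pairwise Euclidean distances (fine for small n).
    dist = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=-1)
    np.fill_diagonal(dist, np.inf)           # exclude self-loops
    nbrs = np.argsort(dist, axis=1)[:, :k]   # k closest points for each row
    adj = np.zeros((n, n))
    adj[np.repeat(np.arange(n), k), nbrs.ravel()] = 1.0
    return adj

def stationary_distribution(adj, teleport=0.01, iters=500, tol=1e-12):
    """Power iteration for the stationary distribution of the simple random
    walk on the directed graph, with a small uniform teleport (an assumption
    of this sketch, used only to guarantee a unique fixed point)."""
    n = adj.shape[0]
    trans = adj / adj.sum(axis=1, keepdims=True)  # row-stochastic transitions
    pi = np.full(n, 1.0 / n)
    for _ in range(iters):
        new = (1.0 - teleport) * (pi @ trans) + teleport / n
        converged = np.abs(new - pi).sum() < tol
        pi = new
        if converged:
            break
    return pi

# Hypothetical usage: 200 Gaussian points in the plane, k = 10.
rng = np.random.default_rng(0)
pts = rng.normal(size=(200, 2))
pi = stationary_distribution(knn_digraph(pts, k=10))
```

In a $k$-nearest neighbor graph every vertex has out-degree exactly $k$, so the out-degree alone carries no information about $\epsilon(x_i)$ or $p(x_i)$. The sketch exposes the stationary distribution `pi` because it depends on the directed edge structure as a whole.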
