Local Intrinsic Dimension Estimation by Generalized Linear Modeling

We propose a method for intrinsic dimension estimation. For an inspection point, we fit a regression model that relates a power of the distance from the point to the number of samples inside the ball whose radius equals that distance, and we evaluate the goodness of fit. The local intrinsic dimension around the inspection point is then estimated by the maximum likelihood method. In global intrinsic dimension estimation experiments, the proposed method is comparable to conventional methods. In further experiments, it outperforms a conventional local dimension estimation method.
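The idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a Poisson regression with identity link and no intercept, in which the number of samples inside a ball of radius r has mean beta * r^d, and it picks d by maximizing the profile likelihood over a grid. The function name `local_dimension_sketch`, the parameters `k` and `d_grid`, and the Poisson assumption itself are hypothetical choices made for illustration.

```python
import numpy as np


def local_dimension_sketch(X, x0, k=200, d_grid=None):
    """Rough local intrinsic dimension estimate around the point x0.

    Assumption (not taken from the paper): the number of samples inside a
    ball of radius r around x0 is modeled as Poisson with mean beta * r**d.
    The cumulative counts are treated as independent, which is a
    simplification. The dimension d is chosen on a grid by maximizing the
    profile likelihood, with beta set to its closed-form maximizer for
    each candidate d.
    """
    if d_grid is None:
        d_grid = np.arange(0.5, 10.0 + 1e-9, 0.05)

    # Distances from the inspection point to the k nearest distinct samples.
    dist = np.linalg.norm(X - x0, axis=1)
    dist = np.sort(dist[dist > 0])[:k]

    # The ball whose radius is the j-th smallest distance contains j samples.
    counts = np.arange(1, len(dist) + 1, dtype=float)
    log_r = np.log(dist)
    total = counts.sum()

    best_d, best_ll = None, -np.inf
    for d in d_grid:
        r_pow = np.exp(d * log_r)      # r**d for every neighbor distance
        beta = total / r_pow.sum()     # maximizer of the likelihood in beta
        mu = beta * r_pow              # fitted mean counts
        # Poisson log-likelihood up to a term that does not depend on d.
        ll = np.sum(counts * np.log(mu) - mu)
        if ll > best_ll:
            best_d, best_ll = float(d), ll
    return best_d


if __name__ == "__main__":
    # Synthetic check: a 2-D uniform sheet embedded in 5-D space.
    rng = np.random.default_rng(0)
    X = np.zeros((5000, 5))
    X[:, :2] = rng.uniform(-1.0, 1.0, size=(5000, 2))
    print(local_dimension_sketch(X, np.zeros(5)))
```

On data like the synthetic sheet above, the estimate should land near the true dimension of 2, although the exact value depends on the sample, on `k`, and on the grid resolution.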
