Discriminant Adaptive Nearest Neighbor Classification

Nearest neighbor classification expects the class conditional probabilities to be locally constant, and suffers from bias in high dimensions. We propose a locally adaptive form of nearest neighbor classification to try to finesse this curse of dimensionality. We use a local linear discriminant analysis to estimate an effective metric for computing neighborhoods. We determine the local decision boundaries from centroid information, then shrink neighborhoods in directions orthogonal to these local decision boundaries and elongate them parallel to the boundaries. Thereafter, any neighborhood-based classifier can be employed, using the modified neighborhoods. The posterior probabilities tend to be more homogeneous in the modified neighborhoods. We also propose a method for global dimension reduction that combines local dimension information. In a number of examples, the methods demonstrate the potential for substantial improvements over nearest neighbor classification.
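
The idea of estimating a local metric from centroid information can be illustrated with a minimal sketch. The abstract does not give a formula, so the code below assumes one plausible form: within-class and between-class covariance matrices W and B are computed from the points nearest the query, and the metric is taken as W^{-1/2} (W^{-1/2} B W^{-1/2} + eps I) W^{-1/2}. The function names, the parameters `n_local`, `eps`, and `k`, the unweighted local neighborhood, and the small ridge added to W are all assumptions made for illustration, not details taken from the text.

```python
import numpy as np

def local_discriminant_metric(X, y, x0, n_local=50, eps=1.0):
    """Estimate a local metric at x0 from nearby class centroids (illustrative sketch)."""
    # Assumption: the initial neighborhood is the n_local Euclidean-nearest points.
    dist = np.linalg.norm(X - x0, axis=1)
    idx = np.argsort(dist)[:n_local]
    Xl, yl = X[idx], y[idx]
    p = X.shape[1]
    overall_mean = Xl.mean(axis=0)

    W = np.zeros((p, p))  # pooled within-class covariance
    B = np.zeros((p, p))  # between-class covariance of the local centroids
    for c in np.unique(yl):
        Xc = Xl[yl == c]
        centroid = Xc.mean(axis=0)
        centered = Xc - centroid
        W += centered.T @ centered / len(Xl)
        B += (len(Xc) / len(Xl)) * np.outer(centroid - overall_mean,
                                            centroid - overall_mean)

    # Assumption: a small ridge keeps W invertible in degenerate neighborhoods.
    W += 1e-6 * np.eye(p)
    evals, evecs = np.linalg.eigh(W)
    W_inv_sqrt = evecs @ np.diag(evals ** -0.5) @ evecs.T

    # Sphere the data with respect to W, then stretch distances along the
    # directions in which the sphered centroids differ.
    B_star = W_inv_sqrt @ B @ W_inv_sqrt
    return W_inv_sqrt @ (B_star + eps * np.eye(p)) @ W_inv_sqrt

def adaptive_knn_predict(X, y, x0, k=5, n_local=50, eps=1.0):
    """Majority vote among the k nearest points under the local metric."""
    Sigma = local_discriminant_metric(X, y, x0, n_local=n_local, eps=eps)
    diff = X - x0
    d2 = np.einsum("ij,jk,ik->i", diff, Sigma, diff)  # squared distances under Sigma
    nearest = np.argsort(d2)[:k]
    labels, counts = np.unique(y[nearest], return_counts=True)
    return labels[np.argmax(counts)]

# Toy data: the class depends only on the first coordinate.
rng = np.random.default_rng(0)
X_demo = rng.uniform(-1.0, 1.0, size=(400, 2))
y_demo = (X_demo[:, 0] > 0).astype(int)
Sigma_demo = local_discriminant_metric(X_demo, y_demo, np.zeros(2))
```

On this toy data the decision boundary is the line where the first coordinate is zero, so the estimated metric at the origin should weight the first coordinate more heavily than the second. That corresponds to a neighborhood that is shrunk across the boundary and elongated along it.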
