Geometric Graphs for Improving Nearest Neighbor Decision Rules

Non-parametric decision rules are powerful because they are simple, perform well, and make no assumptions about the underlying distributions of the data (see Aha [1], Devroye, Györfi and Lugosi [6], Duda and Hart [7], Duda, Hart and Stork [8], McLachlan [10], O’Rourke and Toussaint [11]). In this setting we have available a set of d measurements (also called a feature vector) taken from each member of a data set of n objects (patterns) denoted by {X, Y} = {(X_1, Y_1), (X_2, Y_2), ..., (X_n, Y_n)}, where X_i and Y_i denote, respectively, the feature vector of the ith object and the class label of that object. One of the most attractive decision procedures, conceived by Fix and Hodges in 1951, is the nearest-neighbor rule (1-NN rule) [9]. Let Z be a new pattern (feature vector) to be classified and let X_j be the feature vector in {X, Y} closest to Z. The nearest-neighbor decision rule classifies the pattern Z into class Y_j, breaking ties randomly.
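The 1-NN rule described above can be illustrated with a minimal Python sketch. It is not taken from the paper: the function name `nearest_neighbor_classify`, the `rng` parameter, and the choice of Euclidean distance are assumptions made for illustration, since the text does not fix a particular metric.

```python
import math
import random


def nearest_neighbor_classify(training, z, rng=random):
    """Classify z with the 1-NN rule (illustrative sketch).

    training: list of (feature_vector, label) pairs, i.e. the set {X, Y}.
    z: feature vector of the new pattern Z.
    rng: object with a choice() method, used to break ties at random.
    """
    if not training:
        raise ValueError("training set must not be empty")
    # Distance from Z to every stored pattern X_i.
    # Euclidean distance is an assumption; the text does not name a metric.
    distances = [math.dist(x, z) for x, _ in training]
    best = min(distances)
    # Labels of all patterns that attain the minimum distance.
    tied_labels = [y for (_, y), d in zip(training, distances) if d == best]
    # Break ties randomly, as the rule prescribes.
    return rng.choice(tied_labels)


training = [((0.0, 0.0), "a"), ((1.0, 1.0), "a"), ((5.0, 5.0), "b")]
print(nearest_neighbor_classify(training, (4.0, 4.5)))
```

This brute-force version computes all n distances for every query, which keeps it short; it makes no attempt to reduce the stored set or speed up the search.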

[1] David G. Stork, et al. Pattern Classification, 1973.

[2] J. L. Hodges, et al. Discriminatory Analysis - Nonparametric Discrimination: Consistency Properties, 1989.

[3] David W. Aha, et al. Instance-Based Learning Algorithms, 1991, Machine Learning.

[4] Dennis L. Wilson, et al. Asymptotic Properties of Nearest Neighbor Rules Using Edited Data, 1972, IEEE Trans. Syst. Man Cybern.

[5] T. Wagner, et al. Another Look at the Edited Nearest Neighbor Rule, 1976.

[6] Peter E. Hart, et al. Nearest neighbor pattern classification, 1967, IEEE Trans. Inf. Theory.

[7] Luc Devroye, et al. On the Inequality of Cover and Hart in Nearest Neighbor Discrimination, 1981, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[8] Richard O. Duda, et al. Pattern classification and scene analysis, 1974, A Wiley-Interscience publication.

[9] G. McLachlan. Discriminant Analysis and Statistical Pattern Recognition, 1992.

[10] David W. Aha, et al. Lazy Learning, 1997, Springer Netherlands.

[11] Terry J. Wagner. Convergence of the edited nearest neighbor (Corresp.), 1973, IEEE Trans. Inf. Theory.

[12] László Györfi, et al. A Probabilistic Theory of Pattern Recognition, 1996, Stochastic Modelling and Applied Probability.

[13] David L. Waltz, et al. Toward memory-based reasoning, 1986, CACM.