A counterexample to Tomek's consistency theorem for a condensed nearest neighbor decision rule

Abstract: The condensed nearest neighbor rule (CNN) was proposed by Hart (1968) to reduce the storage needed for the original data set D when implementing the nearest neighbor decision rule in pattern classification problems. Tomek (1976a) suggested two modifications of CNN to improve its performance. The first step of Tomek's second method computes a subset C of D, for subsequent use in CNN, and Tomek claims that C is training-set-consistent, i.e., that all data points in D are correctly classified by the nearest neighbor rule using C. In this note we provide a counterexample to this claim. We also analyze Tomek's algorithm in the context of more recent graph-theoretical condensing schemes.
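To make the notion of training-set consistency concrete, the following minimal Python sketch checks whether a subset C classifies every point of D correctly under the nearest neighbor rule, and builds a consistent subset with Hart's basic CNN procedure. It does not implement Tomek's modifications, whose details are not given here. The function names (`nn_label`, `is_consistent`, `hart_cnn`), the Euclidean metric, and the toy one-dimensional data set are assumptions made for illustration only.

```python
import math

def nn_label(point, subset):
    """Return the label of the element of `subset` nearest to `point`.

    `subset` is a list of (vector, label) pairs; Euclidean distance is an
    assumption of this sketch, and ties go to the earliest element.
    """
    nearest = min(subset, key=lambda item: math.dist(point, item[0]))
    return nearest[1]

def is_consistent(subset, data):
    """True if every (vector, label) pair in `data` is classified
    correctly by the nearest neighbor rule using only `subset`."""
    return all(nn_label(x, subset) == y for x, y in data)

def hart_cnn(data):
    """Hart's condensing procedure: seed the store with the first sample,
    then sweep the data repeatedly, adding every sample the current store
    misclassifies, until a full sweep adds nothing."""
    store = [data[0]]
    changed = True
    while changed:
        changed = False
        for x, y in data:
            if nn_label(x, store) != y:
                store.append((x, y))
                changed = True
    return store

# Hypothetical toy data set D: two classes on the real line.
D = [((0.0,), "a"), ((1.0,), "a"), ((3.0,), "b"), ((4.0,), "b")]
C = hart_cnn(D)
```

In this toy case `is_consistent(C, D)` holds, whereas a subset made of the first sample alone is not consistent because it labels every point "a". A counterexample to a consistency claim is a data set for which the computed subset fails this check.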

[1]  C. W. Swonger Sample Set Condensation for a Condensed Nearest Neighbor Decision Rule for Pattern Recognition , 1972 .

[2]  Franz Aurenhammer,et al.  Voronoi diagrams—a survey of a fundamental geometric data structure , 1991, CSUR.

[3]  I. Tomek,et al.  Two Modifications of CNN , 1976 .

[4]  Chin-Liang Chang,et al.  Finding Prototypes For Nearest Neighbor Classifiers , 1974, IEEE Transactions on Computers.

[5]  G. Gates,et al.  The reduced nearest neighbor rule (Corresp.) , 1972, IEEE Trans. Inf. Theory.

[6]  Peter E. Hart,et al.  Nearest neighbor pattern classification , 1967, IEEE Trans. Inf. Theory.

[7]  Godfried T. Toussaint,et al.  Relative neighborhood graphs and their relatives , 1992, Proc. IEEE.

[8]  Rolf Klein,et al.  Concrete and Abstract Voronoi Diagrams , 1990, Lecture Notes in Computer Science.

[9]  R. Sokal,et al.  A New Statistical Approach to Geographic Variation Analysis , 1969 .

[10]  Gordon T. Wilfong Nearest neighbor problems , 1992, Int. J. Comput. Geom. Appl..

[11]  Franz Aurenhammer Voronoi diagrams—a survey of a fundamental geometric data structure , 1991 .

[12]  Josef Kittler,et al.  Pattern recognition : a statistical approach , 1982 .

[13]  Hugh B. Woodruff,et al.  An algorithm for a selective nearest neighbor decision rule (Corresp.) , 1975, IEEE Trans. Inf. Theory.

[14]  Peter E. Hart,et al.  The condensed nearest neighbor rule (Corresp.) , 1968, IEEE Trans. Inf. Theory.

[15]  Julian R. Ullmann Automatic selection of reference data for use in a nearest-neighbor method of pattern classification (Corresp.) , 1974, IEEE Trans. Inf. Theory.

[16]  I. Tomek An Experiment with the Edited Nearest-Neighbor Rule , 1976 .

[17]  G. Krishna,et al.  The condensed nearest neighbor rule using the concept of mutual nearest neighborhood (Corresp.) , 1979, IEEE Trans. Inf. Theory.

[18]  C. J. Stone,et al.  Consistent Nonparametric Regression , 1977 .

[19]  D. Matula,et al.  Properties of Gabriel Graphs Relevant to Geographic Variation Research and the Clustering of Points in the Plane , 1980 .

[20]  Hiroshi Imai,et al.  Voronoi Diagram in the Laguerre Geometry and its Applications , 1985, SIAM J. Comput..

[21]  K. Fukunaga,et al.  Nonparametric Data Reduction , 1984, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[22]  Luc Devroye,et al.  On the Inequality of Cover and Hart in Nearest Neighbor Discrimination , 1981, IEEE Transactions on Pattern Analysis and Machine Intelligence.