The Canonical Distortion Measure in Feature Space and 1-NN Classification

We prove that the Canonical Distortion Measure (CDM) [2,3] is the optimal distance measure for 1-nearest-neighbour (1-NN) classification, and show that it reduces to squared Euclidean distance in feature space for function classes that can be expressed as linear combinations of a fixed set of features. PAC-like bounds are given on the sample complexity required to learn the CDM. An experiment is presented in which a neural-network CDM was learnt for a Japanese OCR environment and then used for 1-NN classification.
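
As an illustration of the linear case, the following minimal sketch estimates a CDM-style quantity by Monte Carlo and compares it with squared Euclidean distance in feature space. It rests on assumptions that are not taken from the abstract: the CDM is taken to be the expected squared difference between f(x) and f(y) over functions f drawn from the environment, each f is a linear function w . phi(x) with weights w drawn from a standard Gaussian, and the feature map `features` is a hypothetical one chosen purely for the example. Under these assumptions the expectation equals the squared Euclidean distance between phi(x) and phi(y); with a non-identity weight covariance it would instead be a quadratic form in the feature difference.

```python
import random


def features(x):
    # Hypothetical fixed feature map phi(x); chosen only for illustration.
    return [x[0], x[1], x[0] * x[1]]


def sq_euclidean(u, v):
    # Squared Euclidean distance between two equal-length vectors.
    return sum((a - b) ** 2 for a, b in zip(u, v))


def cdm_monte_carlo(x, y, n_samples=20000, seed=0):
    # Estimate E_w[(w . phi(x) - w . phi(y))^2] with w ~ N(0, I).
    # Assumption: squared-error loss and standard Gaussian weights.
    rng = random.Random(seed)
    diff = [a - b for a, b in zip(features(x), features(y))]
    total = 0.0
    for _ in range(n_samples):
        w = [rng.gauss(0.0, 1.0) for _ in diff]
        total += sum(wi * di for wi, di in zip(w, diff)) ** 2
    return total / n_samples


def one_nn(query, train):
    # Return the label of the training point nearest to the query,
    # measured by squared Euclidean distance in feature space.
    pq = features(query)
    _, label = min(train, key=lambda item: sq_euclidean(pq, features(item[0])))
    return label


if __name__ == "__main__":
    x, y = (1.0, 2.0), (0.0, 1.0)
    print("feature-space distance:", sq_euclidean(features(x), features(y)))
    print("Monte Carlo estimate:  ", cdm_monte_carlo(x, y))
    train = [((0.0, 0.0), "a"), ((3.0, 3.0), "b")]
    print("1-NN label:", one_nn((2.5, 3.0), train))
```

The Monte Carlo estimate only approximates the closed form, and the agreement tightens as `n_samples` grows. In a learnt setting the feature map would be replaced by whatever representation has been fitted to the environment.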