Transformation invariance in pattern recognition: Tangent distance and propagation

In pattern recognition, statistical modeling, and regression, the amount of available data is a critical factor affecting performance. With unlimited data and computational resources, even trivial algorithms converge to the optimal solution. In practice, however, data and other resources are limited, and satisfactory performance requires sophisticated methods that regularize the problem by introducing a priori knowledge. Invariance of the output with respect to certain transformations of the input is a typical example of such a priori knowledge. We introduce the concept of tangent vectors, which compactly represent the essence of these transformation invariances, and two classes of algorithms, tangent distance and tangent propagation, which use these invariances to improve performance. © 2001 John Wiley & Sons, Inc. Int J Imaging Syst Technol 11, 181–197, 2000
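As a rough illustration of the tangent-vector idea, the sketch below works on a one-dimensional signal rather than an image and uses a single transformation (translation). It approximates the tangent vector by finite differences and computes a one-sided tangent distance, that is, the smallest Euclidean distance between a target pattern and the line through a prototype along its tangent vector. The function names, the finite-difference scheme, and the toy Gaussian signals are assumptions made for this example and are not taken from the paper.

```python
import math


def translation_tangent(x):
    """Finite-difference tangent vector for a small right shift of x.

    Shifting right by a small amount a gives x(i - a) ~ x(i) - a * x'(i),
    so the derivative with respect to a at a = 0 is -x'(i).
    Border samples are clamped (an assumption of this sketch).
    """
    n = len(x)
    return [-(x[min(i + 1, n - 1)] - x[max(i - 1, 0)]) / 2.0 for i in range(n)]


def euclidean(x, y):
    """Plain Euclidean distance between two equal-length sequences."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


def one_sided_tangent_distance(x, y, t):
    """Minimum over a of ||x + a * t - y|| for a single tangent vector t."""
    tt = sum(v * v for v in t)
    if tt == 0.0:
        # No usable tangent direction: fall back to the Euclidean distance.
        return euclidean(x, y)
    diff = [b - a for a, b in zip(x, y)]
    # Closed-form least-squares coefficient along the tangent direction.
    a = sum(d * v for d, v in zip(diff, t)) / tt
    moved = [xi + a * ti for xi, ti in zip(x, t)]
    return euclidean(moved, y)


# Toy data: a smooth bump and the same bump shifted right by one sample.
x = [math.exp(-((i - 10) ** 2) / 8.0) for i in range(21)]
y = [math.exp(-((i - 11) ** 2) / 8.0) for i in range(21)]

ed = euclidean(x, y)
td = one_sided_tangent_distance(x, y, translation_tangent(x))
print("Euclidean distance:", ed)
print("Tangent distance:  ", td)
```

Because a = 0 is always an admissible choice, this one-sided distance can never exceed the Euclidean distance; for a small shift of a smooth pattern it should be noticeably smaller, which is the sense in which the measure is locally invariant to the transformation.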
