Transformation Invariance in Pattern Recognition - Tangent Distance and Tangent Propagation

In pattern recognition, statistical modeling, or regression, the amount of data is the most critical factor affecting performance. If the amount of data and the computational resources are nearly infinite, many algorithms will probably converge to the optimal solution. When this is not the case, one has to introduce regularizers and a priori knowledge to supplement the available data in order to boost performance. Invariance (or known dependence) with respect to transformations of the input is a frequent instance of such a priori knowledge. In this chapter, we introduce the concept of tangent vectors, which compactly represent the essence of these transformation invariances, and two classes of algorithms, “Tangent distance” and “Tangent propagation”, which make use of these invariances to improve performance.
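To make the idea of a tangent vector concrete, here is a minimal sketch, not the chapter's implementation. It assumes a 1-D signal, a single transformation (translation), a finite-difference tangent vector with zero padding, and a one-sided distance in which only the first pattern is allowed to move along its tangent. The function names are hypothetical.

```python
def translation_tangent(x):
    """Approximate the tangent vector of a 1-D signal under translation.

    Shifting x by a small amount alpha gives x[i - alpha]; its derivative
    with respect to alpha at alpha = 0 is -x'(i), approximated here by a
    central difference with zero padding at both ends (an assumption).
    """
    padded = [0.0] + list(x) + [0.0]
    return [(padded[i] - padded[i + 2]) / 2.0 for i in range(len(x))]


def one_sided_tangent_distance_sq(x, y, tangent):
    """Squared distance from y to the line {x + alpha * tangent}.

    Minimizing ||x + alpha * t - y||^2 over alpha has the closed form
    ||y - x||^2 - (t . (y - x))^2 / (t . t).
    """
    diff = [b - a for a, b in zip(x, y)]
    euclid_sq = sum(d * d for d in diff)
    tt = sum(t * t for t in tangent)
    if tt == 0.0:
        # A flat signal has no translation tangent; fall back to Euclidean.
        return euclid_sq
    proj = sum(t * d for t, d in zip(tangent, diff))
    return max(0.0, euclid_sq - proj * proj / tt)


# Usage: a small bump and the same bump shifted right by one sample.
x = [0.0, 1.0, 2.0, 1.0, 0.0]
y = [0.0, 0.0, 1.0, 2.0, 1.0]
t = translation_tangent(x)
euclid_sq = sum((b - a) ** 2 for a, b in zip(x, y))
tangent_sq = one_sided_tangent_distance_sq(x, y, t)
print(euclid_sq, tangent_sq)
```

In this toy case the tangent-based distance is smaller than the squared Euclidean distance, because part of the difference between the two patterns is explained by a shift. A linear approximation of this kind can only be expected to hold for small transformations.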
