Neural and statistical classifiers: taxonomy and two case studies

This paper discusses pattern classification using neural networks and statistical methods. We give a tutorial overview in which popular classifiers are grouped into distinct categories according to their underlying mathematical principles, and we assess what makes a classifier neural. The overview is complemented by two case studies, using handwritten digit and phoneme data, that test the performance of a number of the most typical neural-network and statistical classifiers. Four methods of our own are included: reduced kernel discriminant analysis, the learning k-nearest neighbors classifier, the averaged learning subspace method, and a version of kernel discriminant analysis.

[1]  David F. Shanno,et al.  Remark on “Algorithm 500: Minimization of Unconstrained Multivariate Functions [E4]” , 1980, TOMS.

[2]  R. Tibshirani,et al.  Generalized additive models for medical research , 1986, Statistical methods in medical research.

[3]  Henry S. Baird,et al.  Recognition technology frontiers , 1993, Pattern Recognit. Lett..

[4]  Halbert White,et al.  Learning in Artificial Neural Networks: A Statistical Perspective , 1989, Neural Computation.

[5]  D. W. Scott,et al.  Multivariate Density Estimation, Theory, Practice and Visualization , 1992 .

[6]  Patrick J. Grother,et al.  NIST Form-Based Handprint Recognition System , 1994 .

[7]  B. Silverman Density estimation for statistics and data analysis , 1986 .

[8]  L. Holmström,et al.  A new multivariate technique for top quark search , 1995 .

[9]  Ching Y. Suen,et al.  Computer recognition of unconstrained handwritten numerals , 1992, Proc. IEEE.

[10]  R. Redner,et al.  Mixture densities, maximum likelihood, and the EM algorithm , 1984 .

[11]  Josef Kittler,et al.  Pattern recognition : a statistical approach , 1982 .

[12]  David J. Marchette,et al.  Adaptive mixture density estimation , 1993, Pattern Recognit..

[13]  Tzay Y. Young,et al.  Classification, Estimation and Pattern Recognition , 1974 .

[14]  J. L. Hodges,et al.  Discriminatory Analysis - Nonparametric Discrimination: Consistency Properties , 1989 .

[15]  Christopher M. Bishop,et al.  Neural networks for pattern recognition , 1995 .

[16]  W. Cleveland,et al.  Smoothing by Local Regression: Principles and Methods , 1996 .

[17]  R. Tibshirani,et al.  Discriminant Analysis by Gaussian Mixtures , 1996 .

[18]  K. Fukunaga,et al.  Nonparametric Data Reduction , 1984, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[19]  John A. Hartigan,et al.  Clustering Algorithms , 1975 .

[20]  M. C. Jones,et al.  E. Fix and J.L. Hodges (1951): An Important Contribution to Nonparametric Discriminant Analysis and Density Estimation: Commentary on Fix and Hodges (1951) , 1989 .

[21]  Michael P. Perrone Proceedings of the NIPS'93 Postconference Workshop on Pulling It All Together: Methods for Combining Neural Networks , 1994 .

[22]  W. Pitts,et al.  A Logical Calculus of the Ideas Immanent in Nervous Activity (1943) , 2021, Ideas That Created the Future.

[23]  Chorkin Chan,et al.  A Three-Layer Adaptive Network for Pattern Density Estimation and Classification , 1991, Int. J. Neural Syst..

[24]  Keinosuke Fukunaga,et al.  Introduction to Statistical Pattern Recognition , 1972 .

[25]  Yoshua Bengio,et al.  Pattern Recognition and Neural Networks , 1995 .

[26]  G. Wahba Smoothing noisy data with spline functions , 1975 .

[27]  László Györfi,et al.  A Probabilistic Theory of Pattern Recognition , 1996, Stochastic Modelling and Applied Probability.

[28]  Keinosuke Fukunaga,et al.  The Reduced Parzen Classifier , 1989, IEEE Trans. Pattern Anal. Mach. Intell..

[29]  Erkki Oja,et al.  Classification with learning k-nearest neighbors , 1996, Proceedings of International Conference on Neural Networks (ICNN'96).

[30]  C. W. Therrien,et al.  Decision, Estimation and Classification: An Introduction to Pattern Recognition and Related Topics , 1989 .

[31]  J. MacQueen Some methods for classification and analysis of multivariate observations , 1967 .

[32]  Robert J. Schalkoff,et al.  Pattern recognition - statistical, structural and neural approaches , 1991 .

[33]  J. Friedman,et al.  Projection Pursuit Regression , 1981 .

[34]  Hans G. C. Tråvén,et al.  A neural network approach to statistical pattern classification by 'semiparametric' estimation of probability density functions , 1991, IEEE Trans. Neural Networks.

[35]  Simon Haykin,et al.  Neural Networks: A Comprehensive Foundation , 1998 .

[36]  Peter Craven,et al.  Smoothing noisy data with spline functions , 1978 .

[37]  Erkki Oja,et al.  Self - Organizing Maps and Computer Vision , 1992 .

[38]  David J. Hand,et al.  Kernel Discriminant Analysis , 1983 .

[39]  Donald F. Specht,et al.  A general regression neural network , 1991, IEEE Trans. Neural Networks.

[40]  Teuvo Kohonen,et al.  Self-Organizing Maps , 2010 .

[41]  David W. Scott,et al.  Multivariate Density Estimation: Theory, Practice, and Visualization , 1992, Wiley Series in Probability and Statistics.

[42]  Erkki Oja,et al.  Subspace methods of pattern recognition , 1983 .

[43]  Donald F. Specht,et al.  Probabilistic neural networks , 1990, Neural Networks.

[44]  M. Stone Cross‐Validatory Choice and Assessment of Statistical Predictions , 1976 .

[45]  Richard Lippmann,et al.  Neural Network Classifiers Estimate Bayesian a posteriori Probabilities , 1991, Neural Computation.

[46]  Teuvo Kohonen,et al.  The 'neural' phonetic typewriter , 1988, Computer.

[47]  M. Berthod,et al.  Automatic recognition of handprinted characters—The state of the art , 1980, Proceedings of the IEEE.

[48]  Keinosuke Fukunaga,et al.  Introduction to statistical pattern recognition (2nd ed.) , 1990 .

[49]  Daryl Pregibon,et al.  Tree-based models , 1992 .

[50]  J. Friedman Regularized Discriminant Analysis , 1989 .

[51]  Padhraic Smyth,et al.  Fault Diagnosis of Antenna Pointing Systems Using Hybrid Neural Network and Signal Processing Models , 1991, NIPS.

[52]  Isabelle Guyon,et al.  Comparison of classifier methods: a case study in handwritten digit recognition , 1994, Proceedings of the 12th IAPR International Conference on Pattern Recognition, Vol. 3 - Conference C: Signal Processing (Cat. No.94CH3440-5).

[53]  Allan R. Wilks,et al.  The new S language: a programming environment for data analysis and graphics , 1988 .

[54]  Sargur N. Srihari,et al.  Recognition of handwritten and machine-printed text for postal address interpretation , 1993, Pattern Recognit. Lett..

[55]  Terrence J. Sejnowski,et al.  Parallel Networks that Learn to Pronounce English Text , 1987, Complex Syst..

[56]  T. Hastie,et al.  Local Regression: Automatic Kernel Carpentry , 1993 .

[57]  Michel Gilloux Research into the new generation of character and mailing address recognition systems at the French post office research center , 1993, Pattern Recognit. Lett..

[58]  Jouko Lampinen,et al.  Distortion tolerant pattern recognition based on self-organizing feature extraction , 1995, IEEE Trans. Neural Networks.

[59]  R. Chevallier,et al.  Comparative Study of Neural Networks and Non Parametric Statistical Methods for Off-Line Handwritten Character Recognition , 1992 .

[60]  Kunihiko Fukushima,et al.  Neocognitron: A hierarchical neural network capable of visual pattern recognition , 1988, Neural Networks.

[61]  Trevor Hastie,et al.  Statistical Models in S , 1991 .

[62]  Rama Chellappa,et al.  Evaluation of pattern classifiers for fingerprint and OCR applications , 1994, Pattern Recognit..

[63]  David J. Spiegelhalter,et al.  Machine Learning, Neural and Statistical Classification , 2009 .

[64]  Lasse Holmström,et al.  The self-organizing reduced kernel density estimator , 1993, IEEE International Conference on Neural Networks.

[65]  Brian D. Ripley,et al.  Neural Networks and Related Methods for Classification , 1994 .

[66]  Sargur N. Srihari,et al.  Bayesian and neural network pattern recognition: a theoretical connection and empirical results with handwritten characters , 1991 .

[67]  J. Orbach Principles of Neurodynamics. Perceptrons and the Theory of Brain Mechanisms. , 1962 .

[68]  Richard O. Duda,et al.  Pattern classification and scene analysis , 1974, A Wiley-Interscience publication.

[69]  Keinosuke Fukunaga,et al.  Statistical Pattern Recognition , 1993, Handbook of Pattern Recognition and Computer Vision.

[70]  W. Cleveland,et al.  Locally Weighted Regression: An Approach to Regression Analysis by Local Fitting , 1988 .

[71]  John A. Hertz,et al.  Exploiting Neurons with Localized Receptive Fields to Learn Chaos , 1990, Complex Syst..

[72]  G. McLachlan Discriminant Analysis and Statistical Pattern Recognition , 1992 .

[73]  Ching Y. Suen,et al.  Building a new generation of handwriting recognition systems , 1993, Pattern Recognit. Lett..

[74]  J. Mantas,et al.  An overview of character recognition methodologies , 1986, Pattern Recognit..

[75]  Ching Y. Suen,et al.  Historical review of OCR research and development , 1992, Proc. IEEE.

[76]  Petri Koistinen,et al.  Kernel regression and backpropagation training with noise , 1991, [Proceedings] 1991 IEEE International Joint Conference on Neural Networks.

[77]  John S. Bridle,et al.  Training Stochastic Model Recognition Algorithms as Networks can Lead to Maximum Mutual Information Estimation of Parameters , 1989, NIPS.

[78]  G. Tutz An alternative choice of smoothing for kernel-based density estimates in discrete discriminant analysis , 1986 .

[79]  William N. Venables,et al.  Modern Applied Statistics with S-Plus. , 1996 .

[80]  King-Sun Fu,et al.  Syntactic Pattern Recognition And Applications , 1968 .

[81]  Anne Guerin-dugue,et al.  Deliverable R3-B4-P-Task B4: Benchmarks , 1995 .

[82]  R. Tibshirani,et al.  Flexible Discriminant Analysis by Optimal Scoring , 1994 .

[83]  F ROSENBLATT,et al.  The perceptron: a probabilistic model for information storage and organization in the brain. , 1958, Psychological review.

[84]  J. Freidman,et al.  Multivariate adaptive regression splines , 1991 .

[85]  Isabelle Guyon Applications of Neural Networks to Character Recognition , 1991, Int. J. Pattern Recognit. Artif. Intell..

[86]  S. Impedovo,et al.  Optical Character Recognition - a Survey , 1991, Int. J. Pattern Recognit. Artif. Intell..

[87]  D. M. Titterington,et al.  Neural Networks: A Review from a Statistical Perspective , 1994 .

[88]  Lawrence D. Jackel,et al.  Backpropagation Applied to Handwritten Zip Code Recognition , 1989, Neural Computation.

[89]  Leo Breiman,et al.  Classification and Regression Trees , 1984 .

[90]  M. Garris NIST form-based handprint recognition system , 1994 .

[91]  W. Highleyman Linear Decision Functions, with Application to Pattern Recognition , 1962, Proceedings of the IRE.

[92]  David J. Marchette,et al.  Adaptive mixtures: Recursive nonparametric pattern recognition , 1991, Pattern Recognit..

[93]  Toru Wakahara,et al.  Toward robust handwritten character recognition , 1993, Pattern Recognit. Lett..

[94]  J. Morgan,et al.  Problems in the Analysis of Survey Data, and a Proposal , 1963 .

[95]  Richard G. Priest,et al.  Pattern classification using projection pursuit , 1990, Pattern Recognit..

[96]  Robert P. W. Duin,et al.  A note on comparing classifiers , 1996, Pattern Recognit. Lett..