Neural Networks: A Statistician's (Possible) View

Within the past few years, neural networks (NNs) have emerged as a popular, fairly general-purpose means of data processing and analysis. Because in most applications they are employed to perform quite standard statistical tasks such as regression analysis and classification, one might wonder what is really new about them. We shed some light on this issue from a statistician’s point of view by “translating” neural network terminology into more familiar statistical terms and then discussing some of the most important properties of neural networks. Particular attention is given to “supervised” classification, i.e., discriminant analysis.
