Paper information - A Connection Between GRBF and MLP

Both multilayer perceptrons (MLP) and Generalized Radial Basis Functions (GRBF) have good approximation properties, theoretically and experimentally. Are they related? The main point of this paper is to show that for normalized inputs, multilayer perceptron networks are radial function networks (albeit with a non-standard radial function). This provides an interpretation of the weights w as centers t of the radial function network, and therefore as equivalent to templates. This insight may be useful for practical applications, including better initialization procedures for MLP. In the remainder of the paper, we discuss the relation between the radial functions that correspond to the sigmoid for normalized inputs and well-behaved radial basis functions, such as the Gaussian. In particular, we observe that the radial function associated with the sigmoid is an activation function that is a good approximation to Gaussian basis functions for a range of values of the bias parameter. The implication is that an MLP network can always simulate a Gaussian GRBF network (with the same number of units but fewer parameters); the converse is true only for certain values of the bias parameter. Numerical experiments indicate that this constraint is not always satisfied in practice by MLP networks trained with backpropagation. Multiscale RBF networks, on the other hand, can approximate MLP networks with a similar number of parameters.
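The central claim of the abstract rests on an elementary identity that can be checked numerically. For an input x with ||x|| = 1 and a weight vector w, expanding ||x - w||^2 = 1 + ||w||^2 - 2 w·x gives w·x = (1 + ||w||^2 - ||x - w||^2) / 2, so a sigmoid unit σ(w·x + θ) depends on x only through its distance to w, which then acts as a center. The following is a minimal sketch of that check, not the paper's own code or derivation: it assumes the logistic sigmoid, and the function names (`mlp_unit`, `radial_unit`, `max_gap`) and the sampling ranges are hypothetical choices made for illustration.

```python
import math
import random


def sigmoid(z):
    """Logistic sigmoid (assumed here; the abstract only says 'the sigmoid')."""
    return 1.0 / (1.0 + math.exp(-z))


def unit_vector(dim, rng):
    """Draw a random vector and rescale it to unit Euclidean norm."""
    v = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    norm = math.sqrt(sum(c * c for c in v))
    return [c / norm for c in v]


def mlp_unit(x, w, theta):
    """Usual MLP hidden unit: sigmoid of the inner product plus a bias."""
    return sigmoid(sum(xi * wi for xi, wi in zip(x, w)) + theta)


def radial_unit(x, w, theta):
    """The same unit written as a function of r^2 = ||x - w||^2.

    Relies on w.x = (||x||^2 + ||w||^2 - r^2) / 2 with ||x|| = 1,
    so it is only valid for normalized inputs.
    """
    r2 = sum((xi - wi) ** 2 for xi, wi in zip(x, w))
    w2 = sum(wi * wi for wi in w)
    return sigmoid((1.0 + w2 - r2) / 2.0 + theta)


def max_gap(trials=1000, dim=5, seed=0):
    """Largest absolute difference between the two forms over random draws."""
    rng = random.Random(seed)
    gap = 0.0
    for _ in range(trials):
        x = unit_vector(dim, rng)                          # normalized input
        w = [rng.uniform(-2.0, 2.0) for _ in range(dim)]   # arbitrary weights
        theta = rng.uniform(-3.0, 3.0)                     # bias parameter
        gap = max(gap, abs(mlp_unit(x, w, theta) - radial_unit(x, w, theta)))
    return gap


if __name__ == "__main__":
    print("max |MLP unit - radial form| =", max_gap())
```

The two forms should agree up to floating-point rounding. The check says nothing about how closely the resulting radial function matches a Gaussian; according to the abstract, that depends on the value of the bias parameter.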

F. Girosi | T. Poggio | M. Maruyama
