A Practical Bayesian Framework for Backpropagation Networks

A quantitative and practical Bayesian framework is described for learning mappings in feedforward networks. The framework makes possible (1) objective comparisons between solutions using alternative network architectures, (2) objective stopping rules for network pruning or growing procedures, (3) objective choice of magnitude and type of weight decay terms or additive regularizers (for penalizing large weights, etc.), (4) a measure of the effective number of well-determined parameters in a model, (5) quantified estimates of the error bars on network parameters and on network output, and (6) objective comparisons with alternative learning and interpolation models such as splines and radial basis functions. The Bayesian "evidence" automatically embodies "Occam's razor," penalizing overflexible and overcomplex models. The Bayesian approach helps detect poor underlying assumptions in learning models. For learning models well matched to a problem, a good correlation between generalization ability and the Bayesian evidence is obtained.
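To make these quantities concrete, the following is a minimal sketch (not code from the paper) of how the Gaussian, i.e. Laplace, approximation behind the evidence framework yields both the log evidence and the effective number of well-determined parameters, assuming a network trained with squared error and a single weight-decay penalty. The function name and argument layout are illustrative choices of this sketch.

import numpy as np

def log_evidence(E_W, E_D, alpha, beta, hessian_ED, N):
    # Gaussian (Laplace) approximation to ln p(D | alpha, beta, H) for a network
    # with squared-error data misfit E_D and a single weight-decay penalty E_W.
    #   E_W        : (1/2) sum_i w_i^2 at the most probable weights
    #   E_D        : (1/2) sum_n (t_n - y_n)^2 at the most probable weights
    #   alpha,beta : weight-decay and noise-level hyperparameters (both > 0)
    #   hessian_ED : k x k Hessian of E_D w.r.t. the weights at the optimum
    #   N          : number of training examples
    k = hessian_ED.shape[0]
    # Eigenvalues of B = beta * (Hessian of E_D); the posterior Hessian is A = B + alpha*I.
    lam = beta * np.linalg.eigvalsh(hessian_ED)
    log_det_A = np.sum(np.log(lam + alpha))   # ln det A
    gamma = np.sum(lam / (lam + alpha))       # effective number of well-determined parameters
    log_ev = (-alpha * E_W - beta * E_D
              - 0.5 * log_det_A
              + 0.5 * k * np.log(alpha)
              + 0.5 * N * np.log(beta)
              - 0.5 * N * np.log(2.0 * np.pi))
    return log_ev, gamma

Comparing log_ev across candidate architectures is what item (1) refers to, and the same eigenvalue sums give the hyperparameter re-estimates alpha <- gamma / (2 E_W) and beta <- (N - gamma) / (2 E_D), which is what turns the choice of weight-decay magnitude in item (3) into an objective procedure rather than a cross-validation search.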
