A Simple Weight Decay Can Improve Generalization
[1] Geoffrey E. Hinton. Learning Translation Invariant Recognition in Massively Parallel Networks, 1987, PARLE.
[2] Terrence J. Sejnowski, et al. Parallel Networks that Learn to Pronounce English Text, 1987, Complex Syst.
[3] David Haussler, et al. What Size Net Gives Valid Generalization?, 1989, Neural Computation.
[4] Naftali Tishby, et al. Consistent inference of probabilities in layered networks: predictions and generalizations, 1989, International 1989 Joint Conference on Neural Networks.
[5] Yann LeCun, et al. Optimal Brain Damage, 1989, NIPS.
[6] Vijay K. Samalam, et al. Exhaustive Learning, 1990, Neural Computation.
[7] David E. Rumelhart, et al. Generalization by Weight-Elimination with Application to Forecasting, 1990, NIPS.
[8] Anders Krogh, et al. Introduction to the theory of neural computation, 1994, The advanced book program.
[9] Hans Henrik Thodberg, et al. Improving Generalization of Neural Networks Through Pruning, 1991, Int. J. Neural Syst.
[10] D. Mackay, et al. A Practical Bayesian Framework for Backprop Networks, 1991.
[11] J. Hertz, et al. Generalization in a linear perceptron in the presence of noise, 1992.