Overfitting in Neural Nets: Backpropagation, Conjugate Gradient, and Early Stopping
[1] Terrence J. Sejnowski, et al. Parallel Networks that Learn to Pronounce English Text, 1987, Complex Systems.
[2] Dean Pomerleau, et al. ALVINN: An Autonomous Land Vehicle in a Neural Network, 1989, NIPS.
[3] David Haussler, et al. What Size Net Gives Valid Generalization?, 1989, Neural Computation.
[4] Yann LeCun, et al. Optimal Brain Damage, 1989, NIPS.
[5] John E. Moody, et al. Note on Learning Rate Schedules for Stochastic Optimization, 1990, NIPS.
[6] David E. Rumelhart, et al. Generalization by Weight-Elimination with Application to Forecasting, 1990, NIPS.
[7] Anders Krogh, et al. A Simple Weight Decay Can Improve Generalization, 1991, NIPS.
[8] James A. Pittman, et al. Recognizing Hand-Printed Letters and Digits Using Backpropagation Learning, 1991, Neural Computation.
[9] John E. Moody, et al. The Effective Number of Parameters: An Analysis of Generalization and Regularization in Nonlinear Learning Systems, 1991, NIPS.
[10] Elie Bienenstock, et al. Neural Networks and the Bias/Variance Dilemma, 1992, Neural Computation.
[11] Andreas Weigend, et al. On Overfitting and the Effective Number of Hidden Units, 1993.
[12] Peter L. Bartlett, et al. For Valid Generalization the Size of the Weights is More Important than the Size of the Network, 1996, NIPS.
[13] David H. Wolpert, et al. On Bias Plus Variance, 1997, Neural Computation.