Universal property of learning curves under entropy loss

A learning curve shows how quickly a learning machine improves its behaviour as the number of training examples increases. This paper studies the universal asymptotic behaviour of learning curves for general dichotomy machines. It is proved rigorously that the average predictive entropy converges to zero approximately as d/t as the number t of training examples increases, where d is the number of modifiable parameters of the machine, irrespective of the machine's architecture.
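Restated compactly (the notation ⟨H(t)⟩ is introduced here only for illustration; the abstract gives the result in words): if ⟨H(t)⟩ denotes the average predictive entropy after t training examples and d the number of modifiable parameters, the claimed universal asymptotics is

    ⟨H(t)⟩ = d/t + o(1/t)   as t → ∞,

with the same leading term for every architecture of the dichotomy machine.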
