Four Types of Learning Curves

When a machine learns to make decisions from examples, the generalization error $\varepsilon(t)$ is defined as the average probability that the machine, trained on $t$ examples, makes an incorrect decision on a new example. The generalization error decreases as $t$ increases, and the curve $\varepsilon(t)$ is called a learning curve. The present paper uses the Bayesian approach to show that, under the annealed approximation, learning curves fall into four asymptotic types. If the machine is deterministic and the teacher signals are noiseless, then (1) $\varepsilon(t) \sim a t^{-1}$ when the correct machine parameter is unique, and (2) $\varepsilon(t) \sim a t^{-2}$ when the set of correct parameters has finite measure. If the teacher signals are noisy, then (3) $\varepsilon(t) \sim a t^{-1/2}$ for a deterministic machine, and (4) $\varepsilon(t) \sim c + a t^{-1}$ for a stochastic machine.
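To make the four regimes concrete, the following minimal Python sketch plots schematic curves of each asymptotic form on log-log axes. The constants a and c are arbitrary placeholders chosen for illustration, not values derived in the paper.

```python
import numpy as np
import matplotlib.pyplot as plt

# Hypothetical constants; in the paper, a and c depend on the machine
# and teacher models. They are arbitrary here.
a, c = 1.0, 0.05
t = np.logspace(0, 4, 200)  # number of training examples

# The four asymptotic types of learning curves.
curves = {
    "(1) deterministic, unique parameter: a t^-1": a / t,
    "(2) deterministic, finite-measure parameter set: a t^-2": a / t**2,
    "(3) deterministic machine, noisy teacher: a t^-1/2": a / np.sqrt(t),
    "(4) stochastic machine, noisy teacher: c + a t^-1": c + a / t,
}

plt.figure()
for label, eps in curves.items():
    plt.loglog(t, eps, label=label)
plt.xlabel("number of training examples t")
plt.ylabel("generalization error eps(t)")
plt.title("Four asymptotic types of learning curves (schematic)")
plt.legend(fontsize=8)
plt.show()
```

On log-log axes the power-law types (1)-(3) appear as straight lines with slopes -1, -2, and -1/2, while type (4) flattens out at the residual error c, which is the visual signature distinguishing a stochastic machine with noisy teacher signals from the other three cases.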
