Vapnik-Chervonenkis Dimension of Recurrent Neural Networks

Most of the work on the Vapnik-Chervonenkis (VC) dimension of neural networks has focused on feedforward networks. However, recurrent networks are also widely used in learning applications, in particular when time is a relevant parameter. This paper provides lower and upper bounds for the VC dimension of such networks. Several types of activation functions are discussed, including threshold, polynomial, piecewise-polynomial, and sigmoidal functions. The bounds depend on two independent parameters: the number w of weights in the network, and the length k of the input sequence. In contrast, for feedforward networks, VC dimension bounds can be expressed as a function of w only. An important difference between recurrent and feedforward nets is that a fixed recurrent net can receive inputs of arbitrary length; we are therefore particularly interested in the case k ≫ w. Ignoring multiplicative constants, the main results say roughly the following. For architectures whose activation σ is any fixed nonlinear polynomial, the VC dimension is ≈ wk. For architectures whose activation σ is any fixed piecewise polynomial, the VC dimension is between wk and w^2 k. For architectures with activation σ = H (threshold nets), the VC dimension is between w log(k/w) and min{wk log(wk), w^2 + w log(wk)}. For the standard sigmoid σ(x) = 1/(1 + e^{-x}), the VC dimension is between wk and w^4 k^2.
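The four regimes stated above can be collected into one display. The following LaTeX block is only a restatement of the abstract's claims, with "≈" and "between ... and ..." rendered, as an assumption on my part, in Θ/Ω/O notation since the abstract explicitly ignores multiplicative constants; the shorthand VC(σ) for the VC dimension of a recurrent architecture with activation σ, w weights, and input length k is introduced here purely for compactness.

% Summary of the stated bounds, up to multiplicative constants.
% w = number of weights, k = length of the input sequence,
% VC(\sigma) = VC dimension of the recurrent architecture with activation \sigma
% (shorthand introduced here, not taken from the original text).
\begin{align*}
\sigma \text{ a fixed nonlinear polynomial:} \quad
  & \mathrm{VC}(\sigma) = \Theta(wk) \\
\sigma \text{ a fixed piecewise polynomial:} \quad
  & \Omega(wk) \le \mathrm{VC}(\sigma) \le O\!\left(w^{2}k\right) \\
\sigma = H \text{ (threshold):} \quad
  & \Omega\!\left(w \log(k/w)\right) \le \mathrm{VC}(\sigma)
    \le O\!\left(\min\{\, wk\log(wk),\; w^{2} + w\log(wk) \,\}\right) \\
\sigma(x) = \tfrac{1}{1+e^{-x}} \text{ (standard sigmoid):} \quad
  & \Omega(wk) \le \mathrm{VC}(\sigma) \le O\!\left(w^{4}k^{2}\right)
\end{align*}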
