Statistical Learning Theory

This chapter presents an overview of statistical learning theory and describes key results on the uniform convergence of empirical means (UCEM) and the associated sample complexity. The theory extends the probability inequalities studied in Chap. 8 from a single fixed function to parameterized families of functions. The chapter formally studies the UCEM property and the VC dimension within the Vapnik–Chervonenkis theory. Extensions to continuous-valued functions, based on the Pollard theory, are also discussed.
