Function Learning from Interpolation

In the problem of learning a real-valued function from examples, an extension of the 'PAC' model, a learner sees the values of an unknown function at a number of randomly chosen points. On the basis of these examples, the learner chooses a function, called a hypothesis, from some class H of hypotheses, with the aim that the hypothesis is close to the target function on future random examples. In this paper we require that, for most training samples, with high probability the absolute difference between the values of the learner's hypothesis and the target function at a random point is small. A natural learning algorithm to consider is one that chooses a function in H that is close to the target function on the training examples. This algorithm, together with the success criterion described above, leads to the definition of a statistical property that we would wish a class of functions to possess.
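The learning scheme described above can be illustrated with a minimal sketch, not taken from the paper: a finite hypothesis class, a learner that picks the hypothesis closest to the target on a random training sample, and an empirical check of the success criterion on fresh random points. All names here (H, target, eps) are illustrative assumptions.

```python
import random

random.seed(0)

# Hypothesis class H: a few linear functions x -> a*x.
H = [lambda x, a=a: a * x for a in [0.0, 0.5, 1.0, 1.5, 2.0]]
target = lambda x: 1.0 * x          # the unknown target (here it lies in H)

# Training sample: function values at randomly chosen points.
sample = [random.uniform(0, 1) for _ in range(20)]

def worst_error(h, points):
    """Largest absolute deviation of h from the target on the given points."""
    return max(abs(h(x) - target(x)) for x in points)

# Learner: choose a function in H that is close to the target on the
# training examples (here, the one with smallest worst-case sample error).
eps = 0.1
hypothesis = min(H, key=lambda h: worst_error(h, sample))

# Success criterion: on a fresh random point, the absolute difference
# between hypothesis and target should be small with high probability.
test = [random.uniform(0, 1) for _ in range(1000)]
close = sum(abs(hypothesis(x) - target(x)) <= eps for x in test)
print(close / len(test))
```

Because the target happens to lie in this toy class, the chosen hypothesis matches it exactly; in general, the paper's statistical property is what guarantees that closeness on the sample transfers to closeness on future random points.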
