The Perceptron Algorithm is Fast for Nonmalicious Distributions

Within the context of Valiant's protocol for learning, the perceptron algorithm is shown to learn an arbitrary half-space in time O(n²/∊³) if D, the probability distribution of examples, is taken uniform over the unit sphere Sⁿ. Here ∊ is the accuracy parameter. This is surprisingly fast, as standard approaches involve the solution of a linear programming problem with Ω(n/∊) constraints in n dimensions. A modification of Valiant's distribution-independent protocol for learning is proposed in which the distribution and the function to be learned may be chosen by adversaries; however, these adversaries may not communicate. It is argued that this definition is more reasonable and applicable to real-world learning than Valiant's. Under this definition, the perceptron algorithm is shown to be a distribution-independent learning algorithm. In an appendix we show that, for uniform distributions, some classes of infinite V-C dimension, including convex sets and a class of nested differences of convex sets, are learnable.
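As a point of reference for the abstract's main claim, the following Python sketch runs the classical mistake-driven perceptron update on examples drawn uniformly from the unit sphere and labeled by a hidden half-space, with an example budget mirroring the stated n²/∊³ bound. This is an illustrative toy under our own assumptions, not the paper's construction: the names (sample_sphere, perceptron_halfspace, eps) and the constants are ours.

```python
import numpy as np

def sample_sphere(rng, m, n):
    # Draw m points uniformly on the unit sphere in R^n by normalizing Gaussians.
    x = rng.standard_normal((m, n))
    return x / np.linalg.norm(x, axis=1, keepdims=True)

def perceptron_halfspace(n=10, eps=0.2, seed=0):
    # Hidden target half-space sign(w_true . x); examples come from the
    # uniform distribution D over the sphere, as in the abstract.
    rng = np.random.default_rng(seed)
    w_true = sample_sphere(rng, 1, n)[0]
    w = np.zeros(n)                   # perceptron hypothesis, started at the origin
    budget = int(n**2 / eps**3)       # example budget mirroring the O(n^2/eps^3) bound
    for _ in range(budget):
        x = sample_sphere(rng, 1, n)[0]
        y = 1.0 if w_true @ x >= 0 else -1.0
        if y * (w @ x) <= 0:          # mistake: classical additive perceptron update
            w += y * x
    # Estimate the hypothesis error on fresh examples from D.
    xs = sample_sphere(rng, 5000, n)
    errors = np.sign(xs @ w_true) != np.sign(xs @ w)
    return w, errors.mean()

if __name__ == "__main__":
    _, err = perceptron_halfspace()
    print(f"estimated error: {err:.3f}")  # should typically land near or below eps
```

The update rule is the standard perceptron step (add y·x on each mistake); the uniform-sphere sampling stands in for the distribution D under which the abstract's time bound is proved.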