Paper information - Hybrid Stochastic-Adversarial On-line Learning

Hybrid Stochastic-Adversarial On-line Learning

Most research in online learning has focused either on the problem of adversarial classification (i.e., both inputs and labels are arbitrarily chosen by an adversary) or on the traditional supervised learning problem in which samples are independently generated from a fixed probability distribution. Nonetheless, in a number of domains the relationship between inputs and labels may be adversarial, whereas input instances are generated according to a constant distribution. This scenario can be formalized as a hybrid classification problem in which inputs are stochastic, while labels are adversarial. In this paper, we introduce this hybrid stochastic-adversarial classification problem, propose an online learning algorithm for its solution, and analyze its performance. In particular, we show that, given a hypothesis space H with finite VC dimension, it is possible to incrementally build a suitable finite set of hypotheses that can be used as input for an exponentially weighted forecaster achieving a cumulative regret of order $O(\sqrt{n \, VC(H) \log n})$ with overwhelming probability. Finally, we discuss extensions to multi-label classification, learning from experts and bandit settings with stochastic side information, and application to games.
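The abstract relies on an exponentially weighted forecaster run over a finite set of hypotheses. The following is a minimal sketch of a generic randomized exponentially weighted forecaster over a fixed list of experts with 0/1 loss. It is not the paper's algorithm: how the finite hypothesis set is built incrementally is not described here, so the experts are simply given as input. The function name `exp_weighted_forecaster` and its parameters are illustrative assumptions.

```python
import math
import random


def exp_weighted_forecaster(expert_predictions, labels, eta, seed=0):
    """Randomized exponentially weighted forecaster (illustrative sketch).

    expert_predictions[t][i] is the prediction of expert i at round t.
    labels[t] is the label revealed after the learner predicts at round t.
    eta is the learning rate; a common tuning for N experts and n rounds
    is sqrt(8 * ln(N) / n).
    Returns (learner_mistakes, cumulative 0/1 loss of each expert).
    """
    rng = random.Random(seed)
    n_experts = len(expert_predictions[0])
    cum_loss = [0] * n_experts
    learner_mistakes = 0

    for preds, y in zip(expert_predictions, labels):
        # Weight of expert i is proportional to exp(-eta * cumulative loss).
        # Subtracting the minimum loss avoids underflow without changing
        # the induced probability distribution.
        best = min(cum_loss)
        weights = [math.exp(-eta * (loss - best)) for loss in cum_loss]

        # Follow one expert drawn at random according to the weights.
        chosen = rng.choices(range(n_experts), weights=weights)[0]
        if preds[chosen] != y:
            learner_mistakes += 1

        # The label is revealed; update every expert's cumulative loss.
        for i, p in enumerate(preds):
            if p != y:
                cum_loss[i] += 1

    return learner_mistakes, cum_loss
```

In this sketch, the regret after n rounds is `learner_mistakes - min(cum_loss)`, the quantity that the abstract's bound controls for the paper's own construction.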

Alessandro Lazaric | Rémi Munos
