Strategies for Prediction Under Imperfect Monitoring

We propose simple randomized strategies for sequential prediction under imperfect monitoring, that is, when the forecaster observes only a feedback signal rather than the past outcomes themselves. The proposed strategies are consistent in the sense that they achieve, asymptotically, the best possible average reward. Rustichini (1999) first proved the existence of such consistent predictors. The forecasters presented here offer the first constructive proof of consistency. Moreover, the proposed algorithms are computationally efficient. We also establish upper bounds for the rates of convergence. In the case of deterministic feedback, these rates are optimal up to logarithmic terms.
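
To make the setting concrete, the following minimal sketch simulates the interaction protocol only: in each round the forecaster picks an action, earns a reward it never sees, and observes a feedback signal. The reward and feedback tables, the function names, and the uniform placeholder forecaster are all illustrative assumptions; none of them is taken from the paper, and the placeholder is not one of the proposed strategies.

```python
import random

# Hypothetical 2-action, 2-outcome game (illustrative values, not from the paper).
REWARD = [[1.0, 0.0],
          [0.0, 1.0]]      # REWARD[action][outcome]; never shown to the forecaster
FEEDBACK = [["a", "a"],
            ["b", "c"]]    # FEEDBACK[action][outcome]; the only thing observed


def play(outcomes, choose_action, seed=0):
    """Run the repeated game; return (average reward, list of observed signals)."""
    rng = random.Random(seed)
    history = []           # (action, signal) pairs: all the forecaster may use
    total = 0.0
    for outcome in outcomes:
        action = choose_action(history, rng)
        total += REWARD[action][outcome]   # accrued, but hidden from the forecaster
        history.append((action, FEEDBACK[action][outcome]))
    return total / len(outcomes), [signal for _, signal in history]


def uniform_forecaster(history, rng):
    """Placeholder randomized strategy that ignores the feedback entirely."""
    return rng.randrange(2)


avg_reward, signals = play([0, 1, 1, 0, 1], uniform_forecaster)
```

In this toy game, action 0 always returns the signal "a" and so reveals nothing about the outcome, while action 1 reveals it. A forecaster has to learn from signals of this kind alone, which is the difficulty the abstract refers to.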

[1]  Vladimir Vovk, et al. Aggregating strategies, 1990, COLT '90.

[2]  D. Freedman. On Tail Probabilities for Martingales, 1975.

[3]  Vladimir Vovk, et al. A game of prediction with expert advice, 1995, COLT '95.

[4]  Microeconomics-Charles W. Upton. Repeated games, 2020, Game Theory.

[5]  K. Schittkowski, et al. Nonlinear Programming, 2022.

[6]  David Haussler, et al. How to use expert advice, 1993, STOC.

[7]  Christian Schindelhauer, et al. Discrete Prediction Games with Arbitrary Feedback and Loss, 2001, COLT/EuroCOLT.

[8]  S. Hart, et al. A simple adaptive procedure leading to correlated equilibrium, 2000.

[9]  A. Rustichini. Minimizing Regret: The General Case, 1999.

[10]  W. Hoeffding. Probability Inequalities for Sums of Bounded Random Variables, 1963.

[11]  Xiaohong Chen, et al. Laws of Large Numbers for Hilbert Space-Valued Mixingales with Applications, 1996, Econometric Theory.

[12]  James Hannan, et al. Approximation to Bayes Risk in Repeated Play, 1958.

[13]  G. Lugosi, et al. On Prediction of Individual Sequences, 1998.

[14]  Claudio Gentile, et al. Adaptive and Self-Confident On-Line Learning Algorithms, 2000, J. Comput. Syst. Sci.

[15]  A. Banos. On Pseudo-Games, 1968.

[16]  Gábor Lugosi, et al. Prediction, learning, and games, 2006.

[17]  Peter Auer, et al. The Nonstochastic Multiarmed Bandit Problem, 2002, SIAM J. Comput.

[18]  Yishay Mansour, et al. Improved second-order bounds for prediction with expert advice, 2006, Machine Learning.

[19]  Nicolò Cesa-Bianchi, et al. Regret Minimization Under Partial Monitoring, 2006, 2006 IEEE Information Theory Workshop - ITW '06 Punta del Este.

[20]  Manfred K. Warmuth, et al. The Weighted Majority Algorithm, 1994, Inf. Comput.

[21]  Kazuoki Azuma. Weighted Sums of Certain Dependent Random Variables, 1967.

[22]  Nicolò Cesa-Bianchi, et al. Analysis of two gradient-based algorithms for on-line regression, 1997, COLT '97.

[23]  N. Megiddo. On repeated games with incomplete information played by non-Bayesian players, 1980.

[24]  Manfred K. Warmuth, et al. Exponentiated Gradient Versus Gradient Descent for Linear Predictors, 1997, Inf. Comput.

[25]  S. Hart, et al. A Reinforcement Procedure Leading to Correlated Equilibrium, 2001.

[26]  D. Blackwell. Controlled Random Walks, 2010.

[28]  Tsachy Weissman, et al. Universal prediction of individual binary sequences in the presence of noise, 2001, IEEE Trans. Inf. Theory.

[29]  Shie Mannor, et al. On-Line Learning with Imperfect Monitoring, 2003, COLT.

[30]  Tsachy Weissman, et al. Twofold universal prediction schemes for achieving the finite-state predictability of a noisy individual binary sequence, 2001, IEEE Trans. Inf. Theory.