Regret minimization in repeated matrix games with variable stage duration
[1] Shie Mannor,et al. The Empirical Bayes Envelope and Regret Minimization in Competitive Markov Decision Processes , 2003, Math. Oper. Res..
[2] S. Hart,et al. A simple adaptive procedure leading to correlated equilibrium , 2000 .
[3] D. Fudenberg,et al. Conditional Universal Consistency , 1999 .
[4] Ehud Lehrer,et al. A wide range no-regret theorem , 2003, Games Econ. Behav..
[5] Stephen P. Boyd,et al. Convex Optimization , 2004, Algorithms and Theory of Computation Handbook.
[6] R. Vohra,et al. Calibrated Learning and Correlated Equilibrium , 1996 .
[7] A. Rustichini. Minimizing Regret: The General Case , 1999 .
[8] D. Fudenberg,et al. An Easier Way to Calibrate , 1999 .
[9] A. Shwartz,et al. Guaranteed performance regions in Markovian systems with competing decision makers , 1993, IEEE Trans. Autom. Control..
[10] S. Hart. Adaptive Heuristics , 2005 .
[11] S. Hart,et al. A General Class of Adaptive Strategies , 1999 .
[12] Sham M. Kakade,et al. Deterministic calibration and Nash equilibrium , 2004, J. Comput. Syst. Sci..
[13] Philip Wolfe,et al. Contributions to the theory of games , 1953 .
[14] D. Fudenberg,et al. The Theory of Learning in Games , 1998 .
[15] S. Hart,et al. A general class of adaptive strategies , 1999 .
[16] Yuhong Yang. Elements of Information Theory (2nd ed.). Thomas M. Cover and Joy A. Thomas , 2008 .
[17] Martin Zinkevich,et al. Online Convex Programming and Generalized Infinitesimal Gradient Ascent , 2003, ICML.
[18] Y. Freund,et al. Adaptive game playing using multiplicative weights , 1999 .
[19] Peter Auer,et al. The Nonstochastic Multiarmed Bandit Problem , 2002, SIAM J. Comput..
[20] Alvaro Sandroni,et al. Calibration with Many Checking Rules , 2003, Math. Oper. Res..
[21] Yishay Mansour,et al. Experts in a Markov Decision Process , 2004, NIPS.
[22] Thomas M. Cover,et al. Elements of Information Theory , 2005 .
[23] D. Fudenberg,et al. Consistency and Cautious Fictitious Play , 1995 .
[24] Gábor Lugosi,et al. Prediction, learning, and games , 2006 .
[25] D. Blackwell. An analog of the minimax theorem for vector payoffs , 1956 .
[26] Dean P. Foster,et al. Regret in the On-Line Decision Problem , 1999 .
[27] D. Blackwell. Controlled Random Walks , 2010 .
[28] Dean P. Foster,et al. A Proof of Calibration Via Blackwell's Approachability Theorem , 1999 .
[29] E. Kalai,et al. Calibrated Forecasting and Merging , 1999 .
[30] L. Shapley,et al. Stochastic Games , 1953, Proceedings of the National Academy of Sciences.
[31] James Hannan,et al. Approximation to Bayes Risk in Repeated Play , 1958 .
[32] Shie Mannor,et al. Online calibrated forecasts: Memory efficiency versus universality for learning in games , 2006, Machine Learning.
[33] Sagnik Sinha,et al. Zero-sum two-person semi-Markov games , 1992, Journal of Applied Probability.