Online Learning with Variable Stage Duration
[1] Y. Freund, et al. Adaptive game playing using multiplicative weights, 1999.
[2] S. Vajda, et al. Contribution to the Theory of Games, 1951.
[3] D. Fudenberg, et al. Consistency and Cautious Fictitious Play, 1995.
[4] D. Blackwell. An analog of the minimax theorem for vector payoffs, 1956.
[5] Shie Mannor, et al. Regret minimization in repeated matrix games with variable stage duration, 2008, Games Econ. Behav.
[6] Sagnik Sinha, et al. Zero-sum two-person semi-Markov games, 1992, Journal of Applied Probability.
[7] Shie Mannor, et al. The Empirical Bayes Envelope and Regret Minimization in Competitive Markov Decision Processes, 2003, Math. Oper. Res.
[8] D. Fudenberg, et al. An Easier Way to Calibrate, 1999.
[9] Stephen P. Boyd, et al. Convex Optimization, 2004, Algorithms and Theory of Computation Handbook.
[10] Peter Auer, et al. The Nonstochastic Multiarmed Bandit Problem, 2002, SIAM J. Comput.
[11] Dean P. Foster, et al. Regret in the On-Line Decision Problem, 1999.
[12] Gábor Lugosi, et al. Prediction, learning, and games, 2006.
[13] Sham M. Kakade, et al. Deterministic calibration and Nash equilibrium, 2004, J. Comput. Syst. Sci.
[14] R. Vohra, et al. Calibrated Learning and Correlated Equilibrium, 1996.