Online Markov Decision Processes
[1] James Hannan,et al. Approximation to Bayes Risk in Repeated Play , 1958 .
[2] David Haussler,et al. How to use expert advice , 1993, STOC.
[3] Manfred K. Warmuth,et al. The Weighted Majority Algorithm , 1994, Inf. Comput..
[4] Martin L. Puterman,et al. Markov Decision Processes: Discrete Stochastic Dynamic Programming , 1994 .
[5] Manfred K. Warmuth,et al. Additive versus exponentiated gradient updates for linear prediction , 1995, STOC '95.
[6] John N. Tsitsiklis,et al. Neuro-Dynamic Programming , 1996, Encyclopedia of Machine Learning.
[7] Yoram Singer,et al. On-Line Portfolio Selection Using Multiplicative Updates , 1998, ICML.
[8] Manfred K. Warmuth,et al. Exponentiated Gradient Versus Gradient Descent for Linear Predictors , 1997, Inf. Comput..
[9] Allan Borodin,et al. Online computation and competitive analysis , 1998 .
[10] Richard S. Sutton,et al. Introduction to Reinforcement Learning , 1998 .
[11] Claudio Gentile,et al. Adaptive and Self-Confident On-Line Learning Algorithms , 2000, J. Comput. Syst. Sci..
[12] Sham M. Kakade,et al. On the sample complexity of reinforcement learning , 2003 .
[13] Avrim Blum,et al. Planning in the Presence of Cost Functions Controlled by an Adversary , 2003, ICML.
[14] Adam Tauman Kalai,et al. Universal Portfolios With and Without Transaction Costs , 1997, COLT '97.
[15] Michael Kearns,et al. Near-Optimal Reinforcement Learning in Polynomial Time , 2002, Machine Learning.
[16] Laurent El Ghaoui,et al. Robust Control of Markov Decision Processes with Uncertain Transition Matrices , 2005, Oper. Res..
[17] Laurent El Ghaoui,et al. Robust Solutions to Markov Decision Problems with Uncertain Transition Matrices , 2005 .
[18] Richard S. Sutton,et al. Reinforcement Learning: An Introduction , 1998, IEEE Trans. Neural Networks.
[19] Santosh S. Vempala,et al. Efficient algorithms for online decision problems , 2005, J. Comput. Syst. Sci..
[20] Nimrod Megiddo,et al. Combining expert advice in reactive environments , 2006, JACM.
[21] John N. Tsitsiklis,et al. NP-Hardness of checking the unichain condition in average cost MDPs , 2006, Oper. Res. Lett..
[22] Shie Mannor,et al. Markov Decision Processes with Arbitrary Reward Processes , 2008, Math. Oper. Res..
[23] U. Rieder,et al. Markov Decision Processes , 2010 .