Reinforcement Learning Algorithms in Markov Decision Processes AAAI-10 Tutorial Part IV: Take home message
暂无分享,去创建一个
Doina Precup | Richard S. Sutton | Csaba Szepesvári | Satinder Singh | Cosmin Paduraru | Anna Koop | R. Sutton | Satinder Singh | Csaba Szepesvari | Doina Precup | Cosmin Paduraru | Anna Koop
[1] Doina Precup,et al. Between MDPs and Semi-MDPs: A Framework for Temporal Abstraction in Reinforcement Learning , 1999, Artif. Intell..
[2] Richard S. Sutton,et al. Temporal Abstraction in Temporal-difference Networks , 2005, NIPS.
[3] Leemon C. Baird,et al. Residual Algorithms: Reinforcement Learning with Function Approximation , 1995, ICML.
[4] Richard S. Sutton,et al. Reinforcement Learning: An Introduction , 1998, IEEE Trans. Neural Networks.
[5] Vladislav Tadic,et al. On the Convergence of Temporal-Difference Learning with Linear Function Approximation , 2001, Machine Learning.
[6] John N. Tsitsiklis,et al. Neuro-Dynamic Programming , 1996, Encyclopedia of Machine Learning.
[7] Warren B. Powell,et al. “Approximate dynamic programming: Solving the curses of dimensionality” by Warren B. Powell , 2007, Wiley Series in Probability and Statistics.
[8] Abhijit Gosavi,et al. Simulation-Based Optimization: Parametric Optimization Techniques and Reinforcement Learning , 2003 .
[9] Dimitri P. Bertsekas,et al. Dynamic Programming and Optimal Control, Two Volume Set , 1995 .
[10] Bart De Schutter,et al. Reinforcement Learning and Dynamic Programming Using Function Approximators , 2010 .
[11] Xi-Ren Cao,et al. Stochastic learning and optimization - A sensitivity-based approach , 2007, Annu. Rev. Control..
[12] Andrew W. Moore,et al. Reinforcement Learning: A Survey , 1996, J. Artif. Intell. Res..
[13] Sanjoy Dasgupta,et al. Off-Policy Temporal Difference Learning with Function Approximation , 2001, ICML.
[14] David K. Smith,et al. Dynamic Programming and Optimal Control. Volume 1 , 1996 .
[15] Andrew G. Barto,et al. Reinforcement learning , 1998 .
[16] John N. Tsitsiklis,et al. Analysis of Temporal-Diffference Learning with Function Approximation , 1996, NIPS.