Ergodic MDPs Admit Self-Optimising Policies
[1] Sean R. Eddy, et al. What is dynamic programming?, 2004, Nature Biotechnology.
[2] Shane Legg, et al. A Taxonomy for Abstract Environments, 2004.
[3] Dimitri P. Bertsekas, et al. Dynamic Programming and Optimal Control, Two Volume Set, 1995.
[4] Richard S. Sutton, et al. Reinforcement Learning: An Introduction, 1998, IEEE Trans. Neural Networks.
[5] Dimitri P. Bertsekas, et al. Dynamic Programming and Optimal Control, Vol. II, 1976.
[6] Marcus Hutter. Optimal Sequential Decisions based on Algorithmic Probability, 2003, ArXiv.
[7] J. Doob. Stochastic processes, 1953.