Paper information - Robust Planning with (L)RTDP

Robust Planning with (L)RTDP

Stochastic Shortest Path problems (SSPs), a subclass of Markov Decision Problems (MDPs), can be solved efficiently using Real-Time Dynamic Programming (RTDP). Yet MDP models are often uncertain, because they are obtained through statistics or guessing. The usual approach is robust planning: searching for the best policy under the worst model. This paper shows how RTDP can be made robust in the common case where each transition probability is known to lie in a given interval.
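To make "the best policy under the worst model" concrete, the following is a minimal sketch of a single robust backup for one state-action pair when each transition probability is only known to lie in an interval. It is an illustration under stated assumptions, not necessarily the paper's exact procedure: the function names (`worst_case_distribution`, `robust_backup`) and the example numbers are hypothetical, costs are minimized, and the intervals are assumed consistent (the lower bounds sum to at most 1 and the upper bounds to at least 1). The worst model is found greedily by pushing as much probability mass as the intervals allow onto the successors with the highest cost-to-go.

```python
def worst_case_distribution(values, p_min, p_max):
    """Return the distribution inside [p_min, p_max] that maximizes expected cost.

    values[i] is the current cost-to-go estimate of successor i.
    Assumes sum(p_min) <= 1 <= sum(p_max) (hypothetical, consistent intervals).
    """
    probs = list(p_min)              # start from the lower bounds
    slack = 1.0 - sum(p_min)         # probability mass still to be assigned
    # Visit successors from the most costly to the least costly.
    order = sorted(range(len(values)), key=lambda i: values[i], reverse=True)
    for i in order:
        bump = min(p_max[i] - p_min[i], slack)
        probs[i] += bump
        slack -= bump
    return probs


def robust_backup(cost, values, p_min, p_max):
    """Worst-case expected cost of one action: immediate cost plus pessimistic expectation."""
    probs = worst_case_distribution(values, p_min, p_max)
    return cost + sum(p * v for p, v in zip(probs, values))


# Hypothetical example: two successors, the second is far more costly.
q = robust_backup(cost=1.0, values=[0.0, 10.0], p_min=[0.5, 0.2], p_max=[0.8, 0.5])
print(q)
```

A robust value update for a state would then take the minimum of `robust_backup` over the available actions, which is the step an RTDP-style trial would repeat along sampled trajectories.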

Olivier Buffet | Douglas Aberdeen
