Planning with Robust (L)RTDP

Stochastic Shortest Path problems (SSPs), a subclass of Markov Decision Problems (MDPs), can be solved efficiently using Real-Time Dynamic Programming (RTDP). Yet MDP models are often uncertain (obtained through statistics or guessing). The usual approach is robust planning: searching for the best policy under the worst model. This paper shows how RTDP can be made robust in the common case where transition probabilities are known to lie in a given interval.
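As a rough illustration of the idea (not the paper's implementation), the sketch below shows a robust Bellman backup for an SSP whose transition probabilities are only known to lie in intervals [p_lo, p_hi], together with a bare-bones RTDP trial loop. All names (worst_case_dist, robust_backup, rtdp_trial) and the toy MDP structure are illustrative assumptions.

```python
# Minimal sketch: robust (worst-model) value backup with interval
# transition probabilities, plus a simple RTDP trial. Assumed structure:
# mdp[state][action] = (cost, successors, lower_bounds, upper_bounds).
import random


def worst_case_dist(succ, p_lo, p_hi, V):
    """Worst (cost-maximising) distribution consistent with the intervals.

    Start each successor at its lower bound, then greedily assign the
    remaining probability mass to the successors with the highest
    cost-to-go, respecting the upper bounds.
    """
    probs = dict(zip(succ, p_lo))
    slack = 1.0 - sum(p_lo)
    for s, lo, hi in sorted(zip(succ, p_lo, p_hi), key=lambda t: -V[t[0]]):
        add = min(hi - lo, slack)
        probs[s] += add
        slack -= add
        if slack <= 0:
            break
    return probs


def robust_backup(s, mdp, V):
    """Return (greedy action, robust value): min over actions of the
    worst-case expected cost under the interval model."""
    best_a, best_q = None, float("inf")
    for a, (cost, succ, p_lo, p_hi) in mdp[s].items():
        probs = worst_case_dist(succ, p_lo, p_hi, V)
        q = cost + sum(p * V[sp] for sp, p in probs.items())
        if q < best_q:
            best_a, best_q = a, q
    return best_a, best_q


def rtdp_trial(s0, goal, mdp, V, max_steps=100):
    """One RTDP trial: robust backup at the current state, then sample a
    successor (here from the worst-case model, one possible choice)."""
    s = s0
    for _ in range(max_steps):
        if s == goal:
            break
        a, V[s] = robust_backup(s, mdp, V)
        cost, succ, p_lo, p_hi = mdp[s][a]
        probs = worst_case_dist(succ, p_lo, p_hi, V)
        s = random.choices(list(probs), weights=list(probs.values()))[0]


if __name__ == "__main__":
    # Tiny 3-state example: states 0 and 1, goal state 2.
    mdp = {
        0: {"a": (1.0, [1, 2], [0.4, 0.3], [0.7, 0.6])},
        1: {"a": (1.0, [0, 2], [0.1, 0.5], [0.5, 0.9])},
        2: {},
    }
    V = {0: 0.0, 1: 0.0, 2: 0.0}  # goal value stays 0
    for _ in range(50):
        rtdp_trial(0, 2, mdp, V)
    print({s: round(v, 3) for s, v in V.items()})
```

The greedy mass-shifting step in worst_case_dist is one standard way to compute the worst-case distribution under interval constraints; labelled convergence checks (the "L" in LRTDP) are omitted here for brevity.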
