Planning in Stochastic Domains: Problem Characteristics and Approximation

This paper is about planning in stochastic domains by means of partially observable Markov decision processes (POMDPs). POMDPs are difficult to solve, and approximation is a must in real-world applications. Approximation methods can be classified into those that solve a POMDP directly and those that approximate a POMDP model by a simpler model. Only one previous method falls into the second category: it approximates POMDPs by using fully observable Markov decision processes (MDPs). We propose to approximate POMDPs by using what we call region-observable POMDPs. Region-observable POMDPs are more complex than MDPs and yet still solvable. They have been empirically shown to yield significantly better approximate policies than MDPs. In the process of designing an algorithm for solving region-observable POMDPs, we also propose a new method for attacking the core problem, known as dynamic-programming updates, that one has to face in solving POMDPs. We have shown elsewhere that the new method is significantly more efficient than the best previous method.
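As background for the dynamic-programming updates mentioned above: solving a POMDP operates on belief states, i.e. probability distributions over the underlying states, which are updated after each action and observation by a Bayes-filter step. The sketch below illustrates that belief update on a hypothetical two-state, one-action POMDP; the transition and observation matrices are invented for illustration and are not from the paper.

```python
import numpy as np

# Hypothetical two-state POMDP (illustrative numbers, not from the paper).
# T[s, s'] = P(s' | s, a) for the single action a;
# O[s', o] = P(o | s') for two possible observations.
T = np.array([[0.9, 0.1],
              [0.2, 0.8]])
O = np.array([[0.8, 0.2],
              [0.3, 0.7]])

def belief_update(b, obs):
    """One Bayes-filter step: predict with T, correct with the
    likelihood of the received observation, then renormalize."""
    predicted = b @ T                  # distribution over s' after acting
    unnormalized = predicted * O[:, obs]
    return unnormalized / unnormalized.sum()

# Start from a uniform belief and observe o = 0.
b0 = np.array([0.5, 0.5])
b1 = belief_update(b0, obs=0)
```

A dynamic-programming update for a POMDP computes the value function over all such belief states at once, which is what makes exact solution expensive and motivates the approximations discussed in the paper.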