Planning in Stochastic Domains: Problem Characteristics and Approximation

This paper is about planning in stochastic domains by means of partially observable Markov decision processes (POMDPs). POMDPs are difficult to solve, and approximation is a must in real-world applications. Approximation methods can be classified into those that solve a POMDP directly and those that approximate a POMDP model by a simpler model. Only one previous method falls into the second category: it approximates POMDPs by using fully observable Markov decision processes (MDPs). We propose to approximate POMDPs by using what we call region-observable POMDPs. Region-observable POMDPs are more complex than MDPs and yet still solvable. They have been empirically shown to yield significantly better approximate policies than MDPs. In the process of designing an algorithm for solving region-observable POMDPs, we also propose a new method for attacking the core problem, known as dynamic-programming updates, that one has to face in solving POMDPs. We have shown elsewhere that the new method is significantly more efficient than the best previous method.
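As background for the dynamic-programming updates mentioned above: solving a POMDP operates on belief states, i.e. probability distributions over the underlying states, which are updated after each action and observation by a Bayes-filter step. The sketch below illustrates that belief update on a hypothetical two-state, one-action POMDP; the transition and observation matrices are invented for illustration and are not from the paper.

```python
import numpy as np

# Hypothetical two-state POMDP (illustrative numbers, not from the paper).
# T[s, s'] = P(s' | s, a) for the single action a;
# O[s', o] = P(o | s') for two possible observations.
T = np.array([[0.9, 0.1],
              [0.2, 0.8]])
O = np.array([[0.8, 0.2],
              [0.3, 0.7]])

def belief_update(b, obs):
    """One Bayes-filter step: predict with T, correct with the
    likelihood of the received observation, then renormalize."""
    predicted = b @ T                  # distribution over s' after acting
    unnormalized = predicted * O[:, obs]
    return unnormalized / unnormalized.sum()

# Start from a uniform belief and observe o = 0.
b0 = np.array([0.5, 0.5])
b1 = belief_update(b0, obs=0)
```

A dynamic-programming update for a POMDP computes the value function over all such belief states at once, which is what makes exact solution expensive and motivates the approximations discussed in the paper.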