A Hybridized Planner for Stochastic Domains

Markov Decision Processes are a powerful framework for planning under uncertainty, but current algorithms have difficulty scaling to large problems. We present a novel probabilistic planner based on the notion of hybridizing two algorithms. In particular, we hybridize GPT, an exact MDP solver, with MBP, a planner that uses a qualitative (nondeterministic) model of uncertainty. Whereas exact MDP solvers produce optimal solutions, qualitative planners sacrifice optimality to achieve speed and high scalability. Our hybridized planner, HYBPLAN, obtains the best of both techniques: speed, quality, and scalability. Moreover, HYBPLAN has excellent anytime properties and makes effective use of available time and memory.

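To make the hybridization idea concrete, the following minimal Python sketch combines an exact MDP policy with a fast qualitative fallback. This is an illustration of the general idea only, not HYBPLAN's actual algorithm; the names (`value_iteration`, `HybridPolicy`, `fallback_policy`) and the specific way the two policies are composed are assumptions made for the example.

```python
def value_iteration(states, actions, transition, reward, gamma=0.95, eps=1e-6):
    """Exact dynamic programming on a small, explicit MDP; returns a greedy policy.
    transition(s, a) is assumed to return a dict {next_state: probability}."""
    V = {s: 0.0 for s in states}
    while True:
        delta = 0.0
        for s in states:
            best = max(
                sum(p * (reward(s, a, s2) + gamma * V[s2])
                    for s2, p in transition(s, a).items())
                for a in actions(s)
            )
            delta = max(delta, abs(best - V[s]))
            V[s] = best
        if delta < eps:
            break
    # Extract the greedy (optimal) action for every solved state.
    return {
        s: max(actions(s),
               key=lambda a: sum(p * (reward(s, a, s2) + gamma * V[s2])
                                 for s2, p in transition(s, a).items()))
        for s in states
    }


class HybridPolicy:
    """Hypothetical combination: act optimally where the exact solver has
    produced a policy, and defer to a cheap qualitative policy elsewhere."""

    def __init__(self, exact_policy, fallback_policy):
        self.exact_policy = exact_policy        # dict: state -> action (from the MDP solver)
        self.fallback_policy = fallback_policy  # callable: state -> action (qualitative planner)

    def act(self, state):
        if state in self.exact_policy:
            return self.exact_policy[state]     # optimal where available
        return self.fallback_policy(state)      # scalable, possibly suboptimal, elsewhere
```

In this sketch the exact solver is run only on a manageable subset of states, while the fallback covers the rest; the anytime behavior described above would correspond to growing the exactly solved region as more time and memory become available.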