Decision-Theoretic Planning

Recent advances in computer speed and in algorithms for probabilistic inference have led to a resurgence of work on planning under uncertainty. The aim is to design AI planners for environments where information might be incomplete or faulty, where actions might not always have the same results, and where there might be tradeoffs between the different possible outcomes of a plan. Addressing uncertainty in AI planning algorithms will greatly increase the range of potential applications, but there is plenty of work to be done before we see practical decision-theoretic planning systems. This article outlines some of the challenges that need to be overcome and surveys some of the recent work in the area.
