[1] Ronald A. Howard,et al. Dynamic Programming and Markov Processes , 1960 .
[2] Alvin W Drake,et al. Observation of a Markov process through a noisy channel , 1962 .
[3] Karl Johan Åström,et al. Optimal control of Markov processes with incomplete state information , 1965 .
[4] Howard Raiffa,et al. Decision analysis: introductory lectures on choices under uncertainty. 1968. , 1969, M.D.Computing.
[5] E. J. Sondik,et al. The Optimal Control of Partially Observable Markov Decision Processes. , 1971 .
[7] J. Satia,et al. Markovian Decision Processes with Probabilistic Observation of States , 1973 .
[8] Edward J. Sondik,et al. The Optimal Control of Partially Observable Markov Processes over a Finite Horizon , 1973, Oper. Res..
[9] Loren K. Platzman,et al. Finite memory estimation and control of finite probabilistic systems , 1977 .
[10] Edward J. Sondik,et al. The Optimal Control of Partially Observable Markov Processes over the Infinite Horizon: Discounted Costs , 1978, Oper. Res..
[11] James N. Eagle. The Optimal Search for a Moving Target When the Search Path Is Constrained , 1984, Oper. Res..
[13] Richard E. Korf,et al. Depth-First Iterative-Deepening: An Optimal Admissible Tree Search , 1985, Artif. Intell..
[14] Geoffrey E. Hinton,et al. Learning internal representations by error propagation , 1986 .
[15] H. Brachinger,et al. Decision analysis , 1997 .
[16] John N. Tsitsiklis,et al. The Complexity of Markov Decision Processes , 1987, Math. Oper. Res..
[17] Judea Pearl,et al. Probabilistic reasoning in intelligent systems , 1988 .
[18] Judea Pearl,et al. Probabilistic reasoning in intelligent systems - networks of plausible inference , 1991, Morgan Kaufmann series in representation and reasoning.
[19] Keiji Kanazawa,et al. A model for reasoning about persistence and causation , 1989 .
[20] Ross D. Shachter,et al. Dynamic programming and influence diagrams , 1990, IEEE Trans. Syst. Man Cybern..
[21] William S. Lovejoy,et al. Computationally Feasible Bounds for Partially Observed Markov Decision Processes , 1991, Oper. Res..
[22] W. Lovejoy. A survey of algorithmic methods for partially observed Markov decision processes , 1991 .
[23] Anne Condon,et al. The Complexity of Stochastic Games , 1992, Inf. Comput..
[24] Uffe Kjærulff,et al. A Computational Scheme for Reasoning in Dynamic Probabilistic Networks , 1992, UAI.
[25] William S. Lovejoy,et al. Suboptimal Policies, with Bounds, for Parameter Adaptive Decision Processes , 1993, Oper. Res..
[26] Ronald J. Williams,et al. Tight Performance Bounds on Greedy Policies Based on Imperfect Value Functions , 1993 .
[27] Andrew W. Moore,et al. Generalization in Reinforcement Learning: Safely Approximating the Value Function , 1994, NIPS.
[28] Michael L. Littman,et al. Memoryless policies: theoretical limitations and practical results , 1994 .
[29] Daniel S. Weld,et al. Probabilistic Planning with Information Gathering and Contingent Execution , 1994, AIPS.
[30] Martin L. Puterman,et al. Markov Decision Processes: Discrete Stochastic Dynamic Programming , 1994 .
[31] Chelsea C. White,et al. Finite-Memory Suboptimal Design for Partially Observed Markov Decision Processes , 1994, Oper. Res..
[32] Michael I. Jordan,et al. Learning Without State-Estimation in Partially Observable Markovian Decision Processes , 1994, ICML.
[33] Geoffrey J. Gordon. Stable Function Approximation in Dynamic Programming , 1995, ICML.
[34] Nicholas Kushmerick,et al. An Algorithm for Probabilistic Planning , 1995, Artif. Intell..
[35] Stuart J. Russell,et al. Stochastic simulation algorithms for dynamic probabilistic networks , 1995, UAI.
[36] Stuart J. Russell,et al. Approximating Optimal Policies for Partially Observable Stochastic Domains , 1995, IJCAI.
[37] Dimitri P. Bertsekas,et al. A Counterexample to Temporal Differences Learning , 1995, Neural Computation.
[38] Leslie Pack Kaelbling,et al. Learning Policies for Partially Observable Environments: Scaling Up , 1997, ICML.
[39] Dimitri P. Bertsekas,et al. Dynamic Programming and Optimal Control, Two Volume Set , 1995 .
[40] Craig Boutilier,et al. Exploiting Structure in Policy Construction , 1995, IJCAI.
[41] Andrew G. Barto,et al. Learning to Act Using Real-Time Dynamic Programming , 1995, Artif. Intell..
[42] Leemon C. Baird,et al. Residual Algorithms: Reinforcement Learning with Function Approximation , 1995, ICML.
[43] Andrew McCallum,et al. Instance-Based Utile Distinctions for Reinforcement Learning with Hidden State , 1995, ICML.
[44] Michel de Rougemont,et al. On the Complexity of Partially Observed Markov Decision Processes , 1996, Theor. Comput. Sci..
[45] Richard Washington,et al. Incremental Markov-model planning , 1996, Proceedings Eighth IEEE International Conference on Tools with Artificial Intelligence.
[46] Michael Isard,et al. Contour Tracking by Stochastic Propagation of Conditional Density , 1996, ECCV.
[47] Michael L. Littman,et al. Algorithms for Sequential Decision Making , 1996 .
[48] Wenju Liu,et al. A Model Approximation Scheme for Planning in Partially Observable Stochastic Domains , 1997, J. Artif. Intell. Res..
[49] Eric A. Hansen,et al. An Improved Policy Iteration Algorithm for Partially Observable MDPs , 1997, NIPS.
[50] Milos Hauskrecht,et al. Planning and control in stochastic domains with imperfect information , 1997 .
[51] E. Allender,et al. Encyclopaedia of Complexity Results for Finite-Horizon Markov Decision Process Problems , 1997 .
[52] Craig Boutilier,et al. Abstraction and Approximate Decision-Theoretic Planning , 1997, Artif. Intell..
[53] Michael L. Littman,et al. Incremental Pruning: A Simple, Fast, Exact Method for Partially Observable Markov Decision Processes , 1997, UAI.
[54] D. Castañón. Approximate dynamic programming for sensor management , 1997, Proceedings of the 36th IEEE Conference on Decision and Control.
[55] Ronen I. Brafman,et al. A Heuristic Variable Grid Solution Method for POMDPs , 1997, AAAI/IAAI.
[56] Wenju Liu,et al. Region-Based Approximations for Planning in Stochastic Domains , 1997, UAI.
[57] Simon J. Godsill,et al. On sequential simulation-based methods for Bayesian filtering , 1998 .
[58] Leslie Pack Kaelbling,et al. Planning and Acting in Partially Observable Stochastic Domains , 1998, Artif. Intell..
[59] Blai Bonet,et al. Learning Sorting and Decision Trees with POMDPs , 1998, ICML.
[60] Xavier Boyen,et al. Tractable Inference for Complex Stochastic Processes , 1998, UAI.
[61] Eric A. Hansen,et al. Solving POMDPs by Searching in Policy Space , 1998, UAI.
[62] A. Cassandra,et al. Exact and approximate algorithms for partially observable Markov decision processes , 1998 .
[63] Kirk A. Yost. Solution of large-scale allocation problems with partially observable outcomes , 1998 .
[64] Stephen S. Lee,et al. Planning with Partially Observable Markov Decision Processes: Advances in Exact Solution Method , 1998, UAI.
[65] Michael I. Jordan. Graphical Models , 1998 .
[66] Anne Condon,et al. On the Undecidability of Probabilistic Planning and Infinite-Horizon Partially Observable Markov Decision Problems , 1999, AAAI/IAAI.
[67] David A. McAllester,et al. Approximate Planning for Factored POMDPs using Belief State Simplification , 1999, UAI.
[68] Craig Boutilier,et al. Decision-Theoretic Planning: Structural Assumptions and Computational Leverage , 1999, J. Artif. Intell. Res..
[69] Xavier Boyen,et al. Exploiting the Architecture of Dynamic Systems , 1999, AAAI/IAAI.
[70] Milos Hauskrecht,et al. Planning treatment of ischemic heart disease with partially observable Markov decision processes , 2000, Artif. Intell. Medicine.
[71] N. Zhang,et al. Algorithms for partially observable Markov decision processes , 2001 .