Learning Finite-State Controllers for Partially Observable Environments
Kee-Eung Kim | Leslie Pack Kaelbling | Nicolas Meuleau | Leonid Peshkin
[1] Ronald A. Howard, et al. Dynamic Programming and Markov Processes, 1960.
[2] Karl Johan Åström, et al. Optimal control of Markov processes with incomplete state information, 1965.
[3] E. J. Sondik, et al. The Optimal Control of Partially Observable Markov Decision Processes, 1971.
[4] J. Satia, et al. Markovian Decision Processes with Probabilistic Observation of States, 1973.
[5] Edward J. Sondik, et al. The Optimal Control of Partially Observable Markov Processes over a Finite Horizon, 1973, Oper. Res.
[6] Edward J. Sondik, et al. The Optimal Control of Partially Observable Markov Processes over the Infinite Horizon: Discounted Costs, 1978, Oper. Res.
[7] Andrew McCallum, et al. Overcoming Incomplete Perception with Utile Distinction Memory, 1993, ICML.
[8] Leslie Pack Kaelbling, et al. Acting Optimally in Partially Observable Stochastic Domains, 1994, AAAI.
[9] Michael L. Littman, et al. Memoryless policies: theoretical limitations and practical results, 1994.
[10] Martin L. Puterman, et al. Markov Decision Processes: Discrete Stochastic Dynamic Programming, 1994.
[11] Michael I. Jordan, et al. Reinforcement Learning Algorithm for Partially Observable Markov Decision Problems, 1994, NIPS.
[12] Michael I. Jordan, et al. Learning Without State-Estimation in Partially Observable Markovian Decision Processes, 1994, ICML.
[13] Leemon C. Baird, et al. Residual Algorithms: Reinforcement Learning with Function Approximation, 1995, ICML.
[14] Andrew W. Moore, et al. Reinforcement Learning: A Survey, 1996, J. Artif. Intell. Res.
[15] Andrew McCallum, et al. Reinforcement learning with selective perception and hidden state, 1996.
[16] John N. Tsitsiklis, et al. Neuro-Dynamic Programming, 1996.
[17] Jürgen Schmidhuber, et al. HQ-Learning, 1997, Adapt. Behav.
[18] Eric A. Hansen, et al. An Improved Policy Iteration Algorithm for Partially Observable MDPs, 1997, NIPS.
[19] Milos Hauskrecht, et al. Planning and control in stochastic domains with imperfect information, 1997.
[20] Leslie Pack Kaelbling, et al. Planning and Acting in Partially Observable Stochastic Domains, 1998, Artif. Intell.
[21] Eric A. Hansen, et al. Solving POMDPs by Searching in Policy Space, 1998, UAI.
[22] Andrew W. Moore, et al. Gradient Descent for General Reinforcement Learning, 1998, NIPS.
[23] Richard S. Sutton, et al. Introduction to Reinforcement Learning, 1998.
[24] A. Cassandra, et al. Exact and approximate algorithms for partially observable Markov decision processes, 1998.
[25] Shlomo Zilberstein, et al. Finite-memory control of partially observable systems, 1998.
[26] Kee-Eung Kim, et al. Solving POMDPs by Searching the Space of Finite Policies, 1999, UAI.
[27] Leslie Pack Kaelbling, et al. Learning Policies with External Memory, 1999, ICML.
[28] Craig Boutilier, et al. Decision-Theoretic Planning: Structural Assumptions and Computational Leverage, 1999, J. Artif. Intell. Res.