Leonid Peshkin | Kee-Eung Kim | Nicolas Meuleau | Leslie Pack Kaelbling
[1] Kenneth J. Arrow, et al. Stability of the Gradient Process in n-Person Games, 1960.
[2] Edward J. Sondik, et al. The Optimal Control of Partially Observable Markov Processes over a Finite Horizon, 1973, Oper. Res.
[3] Edward J. Sondik, et al. The Optimal Control of Partially Observable Markov Processes over the Infinite Horizon: Discounted Costs, 1978, Oper. Res.
[4] Michael L. Littman, et al. Memoryless Policies: Theoretical Limitations and Practical Results, 1994.
[5] Martin L. Puterman, et al. Markov Decision Processes: Discrete Stochastic Dynamic Programming, 1994.
[6] Michael I. Jordan, et al. Learning Without State-Estimation in Partially Observable Markovian Decision Processes, 1994, ICML.
[7] Ariel Rubinstein, et al. A Course in Game Theory, 1995.
[8] Andrew W. Moore, et al. Reinforcement Learning: A Survey, 1996, J. Artif. Intell. Res.
[9] Craig Boutilier, et al. The Dynamics of Reinforcement Learning in Cooperative Multiagent Systems, 1998, AAAI/IAAI.
[10] Michael P. Wellman, et al. Multiagent Reinforcement Learning: Theoretical Framework and an Algorithm, 1998, ICML.
[11] Andrew W. Moore, et al. Gradient Descent for General Reinforcement Learning, 1998, NIPS.
[12] Richard S. Sutton, et al. Introduction to Reinforcement Learning, 1998.
[13] L. Baird. Reinforcement Learning Through Gradient Descent, 1999.
[14] Craig Boutilier, et al. Sequential Optimality and Coordination in Multiagent Systems, 1999, IJCAI.
[15] Leslie Pack Kaelbling, et al. Learning Policies with External Memory, 1999, ICML.
[16] Andrew W. Moore, et al. Distributed Value Functions, 1999, ICML.
[17] Kee-Eung Kim, et al. Learning Finite-State Controllers for Partially Observable Environments, 1999, UAI.
[18] Neil Immerman, et al. The Complexity of Decentralized Control of Markov Decision Processes, 2000, UAI.