Conditional random fields for multi-agent reinforcement learning
[1] Kee-Eung Kim,et al. Learning to Cooperate via Policy Search , 2000, UAI.
[2] Thomas Hofmann,et al. Exponential Families for Conditional Random Fields , 2004, UAI.
[3] Jin Yu,et al. Natural Actor-Critic for Road Traffic Optimisation , 2006, NIPS.
[4] Yishay Mansour,et al. Policy Gradient Methods for Reinforcement Learning with Function Approximation , 1999, NIPS.
[5] Stefan Schaal,et al. Natural Actor-Critic , 2003, Neurocomputing.
[6] Craig Boutilier,et al. Sequential Optimality and Coordination in Multiagent Systems , 1999, IJCAI.
[7] Michail G. Lagoudakis,et al. Coordinated Reinforcement Learning , 2002, ICML.
[8] David J. C. MacKay,et al. Information Theory, Inference, and Learning Algorithms , 2004, IEEE Transactions on Information Theory.
[9] R. J. Williams,et al. Simple Statistical Gradient-Following Algorithms for Connectionist Reinforcement Learning , 1992, Machine Learning.
[10] Leslie Pack Kaelbling,et al. Representing hierarchical POMDPs as DBNs for multi-scale robot localization , 2004, ICRA.
[11] Peter L. Bartlett,et al. Infinite-Horizon Policy-Gradient Estimation , 2001, J. Artif. Intell. Res.
[12] Shun-ichi Amari,et al. Natural Gradient Works Efficiently in Learning , 1998, Neural Computation.
[13] Andrew W. Moore,et al. Distributed Value Functions , 1999, ICML.
[14] G. Casella,et al. Rao-Blackwellisation of sampling schemes , 1996.
[15] Neil Immerman,et al. The Complexity of Decentralized Control of Markov Decision Processes , 2000, UAI.
[16] Nando de Freitas,et al. From Fields to Trees , 2004, UAI.
[17] Andrew Y. Ng,et al. On Local Rewards and Scaling Distributed Reinforcement Learning , 2005, NIPS.
[18] Andrew McCallum,et al. Conditional Random Fields: Probabilistic Models for Segmenting and Labeling Sequence Data , 2001, ICML.