Shaping multi-agent systems with gradient reinforcement learning
Olivier Buffet | François Charpillet | Alain Dutech
[1] B. Skinner,et al. Science and human behavior , 1953 .
[2] Christopher M. Bishop,et al. Advances in Neural Information Processing Systems 8 (NIPS 1995) , 1991 .
[3] Toby Tyrrell,et al. Computational mechanisms for action selection , 1993 .
[4] Michael I. Jordan,et al. MASSACHUSETTS INSTITUTE OF TECHNOLOGY ARTIFICIAL INTELLIGENCE LABORATORY and CENTER FOR BIOLOGICAL AND COMPUTATIONAL LEARNING DEPARTMENT OF BRAIN AND COGNITIVE SCIENCES , 1996 .
[5] Martin L. Puterman,et al. Markov Decision Processes: Discrete Stochastic Dynamic Programming , 1994 .
[6] Michael I. Jordan,et al. Learning Without State-Estimation in Partially Observable Markovian Decision Processes , 1994, ICML.
[7] David Carmel,et al. Opponent Modeling in Multi-Agent Systems , 1995, Adaption and Learning in Multi-Agent Systems.
[8] Ben J. A. Kröse,et al. Learning from delayed rewards , 1995, Robotics Auton. Syst..
[9] Sandip Sen,et al. Adaption and Learning in Multi-Agent Systems , 1995, Lecture Notes in Computer Science.
[10] Leslie Pack Kaelbling,et al. Learning Policies for Partially Observable Environments: Scaling Up , 1997, ICML.
[11] Andrew W. Moore,et al. Reinforcement Learning: A Survey , 1996, J. Artif. Intell. Res..
[12] Andrew McCallum,et al. Reinforcement learning with selective perception and hidden state , 1996 .
[13] John N. Tsitsiklis,et al. Neuro-Dynamic Programming , 1996, Encyclopedia of Machine Learning.
[14] Craig Boutilier,et al. Planning, Learning and Coordination in Multiagent Decision Processes , 1996, TARK.
[15] Edmund H. Durfee,et al. Agents Learning about Agents: A Framework and Analysis , 1997 .
[16] Maja J. Mataric,et al. Reinforcement Learning in the Multi-Robot Domain , 1997, Auton. Robots.
[17] Michael P. Wellman,et al. Online learning about other agents in a dynamic multiagent system , 1998, AGENTS '98.
[18] Preben Alstrøm,et al. Learning to Drive a Bicycle Using Reinforcement Learning and Shaping , 1998, ICML.
[19] Richard S. Sutton,et al. Introduction to Reinforcement Learning , 1998 .
[20] A. Cassandra,et al. Exact and approximate algorithms for partially observable Markov decision processes , 1998 .
[21] Kagan Tumer,et al. General principles of learning-based multi-agent systems , 1999, AGENTS '99.
[22] Andrew Y. Ng,et al. Policy Invariance Under Reward Transformations: Theory and Application to Reward Shaping , 1999, ICML.
[23] Manuela M. Veloso,et al. Team-partitioned, opaque-transition reinforcement learning , 1999, AGENTS '99.
[24] Yishay Mansour,et al. Policy Gradient Methods for Reinforcement Learning with Function Approximation , 1999, NIPS.
[25] Kagan Tumer,et al. An Introduction to Collective Intelligence , 1999, ArXiv.
[26] G. Di Caro,et al. Ant colony optimization: a new meta-heuristic , 1999, Proceedings of the 1999 Congress on Evolutionary Computation-CEC99 (Cat. No. 99TH8406).
[27] Neil Immerman,et al. The Complexity of Decentralized Control of Markov Decision Processes , 2000, UAI.
[28] Manuela M. Veloso,et al. Multiagent Systems: A Survey from a Machine Learning Perspective , 2000, Auton. Robots.
[29] Victor R. Lesser,et al. Communication in multi-agent Markov decision processes , 2000, Proceedings Fourth International Conference on MultiAgent Systems.
[30] Kee-Eung Kim,et al. Learning to Cooperate via Policy Search , 2000, UAI.
[31] Kagan Tumer,et al. Collective Intelligence and Braess' Paradox , 2000, AAAI/IAAI.
[32] Alain Dutech,et al. Solving POMDPs Using Selected Past Events , 2000, ECAI.
[33] Manuela M. Veloso,et al. Layered Learning , 2000, ECML.
[34] Jette Randløv,et al. Shaping in Reinforcement Learning by Changing the Physics of the Problem , 2000, ICML.
[35] Peter L. Bartlett,et al. Infinite-Horizon Policy-Gradient Estimation , 2001, J. Artif. Intell. Res..
[36] Peter L. Bartlett,et al. Experiments with Infinite-Horizon, Policy-Gradient Estimation , 2001, J. Artif. Intell. Res..
[37] Learning in Large Cooperative Multi-Robot Domains , 2001 .
[38] Bernhard Hengst,et al. Discovering Hierarchy in Reinforcement Learning with HEXQ , 2002, ICML.
[39] Milind Tambe,et al. The Communicative Multiagent Team Decision Problem: Analyzing Teamwork Theories and Models , 2002, J. Artif. Intell. Res..
[40] Yoav Shoham,et al. Multi-Agent Reinforcement Learning: a critical survey , 2003 .
[41] Eduardo F. Camacho,et al. Adaption and learning , 2003 .
[42] Claudia V. Goldman,et al. Decentralized language learning through acting , 2004, Proceedings of the Third International Joint Conference on Autonomous Agents and Multiagent Systems, 2004. AAMAS 2004..
[43] Brian P. Gerkey,et al. A Formal Analysis and Taxonomy of Task Allocation in Multi-Robot Systems , 2004, Int. J. Robotics Res..
[44] C. Rogowski. Model-based Opponent Modelling in Domains Beyond the Prisoner's Dilemma , 2004 .
[45] Jürgen Schmidhuber,et al. Learning Team Strategies: Soccer Case Studies , 1998, Machine Learning.
[46] Olivier Buffet,et al. Self-Growth of Basic Behaviors in an Action Selection Based Agent , 2004 .
[47] G. DeJong,et al. Theory and Application of Reward Shaping in Reinforcement Learning , 2004 .
[48] Prashant Doshi,et al. Interactive POMDPs: properties and preliminary results , 2004, Proceedings of the Third International Joint Conference on Autonomous Agents and Multiagent Systems, 2004. AAMAS 2004..
[49] Olivier Buffet,et al. Développement autonome des comportements de base d'un agent , 2005, Rev. d'Intelligence Artif..
[50] Luís Torgo,et al. Machine Learning: ECML 2005, 16th European Conference on Machine Learning, Porto, Portugal, October 3-7, 2005, Proceedings , 2005, ECML.
[51] Richard S. Sutton,et al. Reinforcement Learning: An Introduction , 1998, IEEE Trans. Neural Networks.
[52] Minoru Asada,et al. Purposive Behavior Acquisition for a Real Robot by Vision-Based Reinforcement Learning , 2005, Machine Learning.
[53] Stefan Schaal,et al. Natural Actor-Critic , 2003, Neurocomputing.
[54] O. Buffet. The Factored Policy Gradient planner (IPC-06 Version) , 2006 .