Value Propagation for Decentralized Networked Deep Multi-agent Reinforcement Learning
Yuan Qi | Le Song | Shie Mannor | Huan Xu | Chao Qu | Junwu Xiong