A novel multi-step Q-learning method to improve data efficiency for deep reinforcement learning
Wei Wu | Zhu Liang Yu | Zhenghui Gu | Yao Yeboah | Yinlong Yuan | Yuanqing Li | Jingcong Li | Xiaoyan Deng
[1] Yuval Tassa, et al. Continuous control with deep reinforcement learning, 2015, ICLR.
[2] Hui Gao, et al. A proactive decision support method based on deep reinforcement learning and state partition, 2017, Knowl. Based Syst.
[3] Kenji Doya, et al. Sigmoid-Weighted Linear Units for Neural Network Function Approximation in Reinforcement Learning, 2017, Neural Networks.
[4] Shane Legg, et al. Human-level control through deep reinforcement learning, 2015, Nature.
[5] Geoffrey E. Hinton, et al. Deep Learning, 2015, Nature.
[6] Tom Schaul, et al. Prioritized Experience Replay, 2015, ICLR.
[7] José David Martín-Guerrero, et al. Online fitted policy iteration based on extreme learning machines, 2016, Knowl. Based Syst.
[8] Hado van Hasselt, et al. Double Q-learning, 2010, NIPS.
[9] Richard S. Sutton, et al. Reinforcement learning architectures for animats, 1991.
[10] Alex Graves, et al. Playing Atari with Deep Reinforcement Learning, 2013, ArXiv.
[11] Richard S. Sutton, et al. Generalization in Reinforcement Learning: Successful Examples Using Sparse Coarse Coding, 1995, NIPS.
[12] Glen Berseth, et al. Terrain-adaptive locomotion skills using deep reinforcement learning, 2016, ACM Trans. Graph.
[13] Jürgen Schmidhuber, et al. Deep learning in neural networks: An overview, 2014, Neural Networks.
[14] Ben J. A. Kröse, et al. Learning from delayed rewards, 1995, Robotics Auton. Syst.
[15] Jan Peters, et al. Reinforcement Learning to Adjust Robot Movements to New Situations, 2010, IJCAI.
[16] Demis Hassabis, et al. Mastering the game of Go with deep neural networks and tree search, 2016, Nature.
[17] Shimon Whiteson, et al. A theoretical and empirical analysis of Expected Sarsa, 2009, 2009 IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning.
[18] John N. Tsitsiklis, et al. Analysis of Temporal-Difference Learning with Function Approximation, 1996, NIPS.
[19] Ai Poh Loh, et al. Model-based contextual policy search for data-efficient generalization of robot skills, 2017, Artif. Intell.
[20] Jing Peng, et al. Incremental multi-step Q-learning, 1994, Machine Learning.
[21] Richard S. Sutton, et al. Learning to predict by the methods of temporal differences, 1988, Machine Learning.
[22] Alex Graves, et al. Asynchronous Methods for Deep Reinforcement Learning, 2016, ICML.
[23] Dan Xia, et al. Learning classifier system with average reward reinforcement learning, 2013, Knowl. Based Syst.
[24] Raquel Dormido, et al. Tackling the start-up of a reinforcement learning agent for the control of wastewater treatment plants, 2017, Knowl. Based Syst.
[25] David Silver, et al. Deep Reinforcement Learning with Double Q-Learning, 2015, AAAI.
[26] Carl E. Rasmussen, et al. Gaussian Processes for Data-Efficient Learning in Robotics and Control, 2015, IEEE Transactions on Pattern Analysis and Machine Intelligence.
[27] Stefano Palminteri, et al. The Computational Development of Reinforcement Learning during Adolescence, 2016, PLoS Comput. Biol.
[28] Andrew W. Moore, et al. Reinforcement Learning: A Survey, 1996, J. Artif. Intell. Res.
[29] Marc Peter Deisenroth, et al. Deep Reinforcement Learning: A Brief Survey, 2017, IEEE Signal Processing Magazine.
[30] Daniel Marco, et al. Markov Random Processes Are Neither Bandlimited nor Recoverable From Samples or After Quantization, 2009, IEEE Transactions on Information Theory.