Shallow Updates for Deep Reinforcement Learning
Shie Mannor | Nir Levine | Aviv Tamar | Tom Zahavy | Daniel J. Mankowitz
[1] Shane Legg, et al. Human-level control through deep reinforcement learning, 2015, Nature.
[2] Shie Mannor, et al. Regularized Policy Iteration, 2008, NIPS.
[3] Tom Schaul, et al. Learning from Demonstrations for Real World Reinforcement Learning, 2017, ArXiv.
[4] Tom Schaul, et al. Dueling Network Architectures for Deep Reinforcement Learning, 2015, ICML.
[5] Jimmy Ba, et al. Adam: A Method for Stochastic Optimization, 2014, ICLR.
[6] Yann LeCun, et al. What is the best multi-stage architecture for object recognition?, 2009, 2009 IEEE 12th International Conference on Computer Vision.
[7] Richard S. Sutton, et al. Reinforcement Learning: An Introduction, 1998, IEEE Trans. Neural Networks.
[8] Pierre Geurts, et al. Tree-Based Batch Mode Reinforcement Learning, 2005, J. Mach. Learn. Res.
[9] Vivek F. Farias, et al. Approximate Dynamic Programming via a Smoothed Linear Program, 2009, Oper. Res.
[10] Shie Mannor, et al. A Deep Hierarchical Approach to Lifelong Learning in Minecraft, 2016, AAAI.
[11] Martin A. Riedmiller. Neural Fitted Q Iteration - First Experiences with a Data Efficient Neural Reinforcement Learning Method, 2005, ECML.
[12] Marc G. Bellemare, et al. The Arcade Learning Environment: An Evaluation Platform for General Agents (Extended Abstract), 2012, IJCAI.
[13] Joel Z. Leibo, et al. Model-Free Episodic Control, 2016, ArXiv.
[14] Razvan Pascanu, et al. Sharp Minima Can Generalize For Deep Nets, 2017, ICML.
[15] Alex Graves, et al. Asynchronous Methods for Deep Reinforcement Learning, 2016, ICML.
[16] Elad Hoffer, et al. Train longer, generalize better: closing the generalization gap in large batch training of neural networks, 2017, NIPS.
[17] Trevor Darrell, et al. DeCAF: A Deep Convolutional Activation Feature for Generic Visual Recognition, 2013, ICML.
[18] G. C. Tiao, et al. Bayesian inference in statistical analysis, 1973.
[19] David Silver, et al. Deep Reinforcement Learning with Double Q-Learning, 2015, AAAI.
[20] Long-Ji Lin, et al. Reinforcement learning for robots using neural networks, 1992.
[21] Shie Mannor, et al. Graying the black box: Understanding DQNs, 2016, ICML.
[22] Michail G. Lagoudakis, et al. Least-Squares Policy Iteration, 2003, J. Mach. Learn. Res.
[23] Matthieu Geist, et al. Approximate modified policy iteration and its application to the game of Tetris, 2015, J. Mach. Learn. Res.
[24] Razvan Pascanu, et al. Overcoming catastrophic forgetting in neural networks, 2016, Proceedings of the National Academy of Sciences.
[25] Jorge Nocedal, et al. On Large-Batch Training for Deep Learning: Generalization Gap and Sharp Minima, 2016, ICLR.
[26] Shie Mannor, et al. Bayesian Reinforcement Learning: A Survey, 2015, Found. Trends Mach. Learn.
[27] Demis Hassabis, et al. Mastering the game of Go with deep neural networks and tree search, 2016, Nature.
[28] Dimitri P. Bertsekas, et al. Approximate Dynamic Programming, 2017, Encyclopedia of Machine Learning and Data Mining.
[29] Andrew Y. Ng, et al. Regularization and feature selection in least-squares temporal difference learning, 2009, ICML '09.
[30] Andrew G. Barto, et al. Improving Elevator Performance Using Reinforcement Learning, 1995, NIPS.
[31] Lawrence Carin, et al. Linear Feature Encoding for Reinforcement Learning, 2016, NIPS.
[32] John N. Tsitsiklis, et al. Analysis of Temporal-Difference Learning with Function Approximation, 1996, NIPS.
[33] Marlos C. Machado, et al. State of the Art Control of Atari Games Using Shallow Reinforcement Learning, 2015, AAMAS.
[34] Tom Schaul, et al. Deep Q-learning From Demonstrations, 2017, AAAI.
[35] F. Wilcoxon. Individual Comparisons by Ranking Methods, 1945.