Learning values across many orders of magnitude
Hado van Hasselt | Arthur Guez | Matteo Hessel | Volodymyr Mnih | David Silver
[1] A. A. Mullin, et al. Principles of neurodynamics, 1962.
[2] Geoffrey E. Hinton, et al. Learning internal representations by error propagation, 1986.
[3] W. Newey, et al. Asymmetric Least Squares Estimation and Testing, 1987.
[4] W. S. McCulloch, et al. A logical calculus of the ideas immanent in nervous activity, 1990, The Philosophy of Artificial Intelligence.
[5] Richard S. Sutton, et al. Adapting Bias by Gradient Descent: An Incremental Version of Delta-Bar-Delta, 1992, AAAI.
[6] Ben J. A. Kröse, et al. Learning from delayed rewards, 1995, Robotics Auton. Syst.
[7] Yoshua Bengio, et al. Gradient-based learning applied to document recognition, 1998, Proc. IEEE.
[8] Shun-ichi Amari, et al. Natural Gradient Works Efficiently in Learning, 1998, Neural Computation.
[9] Sepp Hochreiter, et al. The Vanishing Gradient Problem During Learning Recurrent Neural Nets and Problem Solutions, 1998, Int. J. Uncertain. Fuzziness Knowl. Based Syst.
[10] H. Kushner, et al. Stochastic Approximation and Recursive Algorithms and Applications, 2003.
[11] H. Robbins. A Stochastic Approximation Method, 1951.
[12] Yoram Singer, et al. Adaptive Subgradient Methods for Online Learning and Stochastic Optimization, 2011, J. Mach. Learn. Res.
[13] Hado van Hasselt, et al. Double Q-learning, 2010, NIPS.
[14] Yoshua Bengio, et al. Algorithms for Hyper-Parameter Optimization, 2011, NIPS.
[15] Yoshua Bengio, et al. Random Search for Hyper-Parameter Optimization, 2012, J. Mach. Learn. Res.
[16] Jasper Snoek, et al. Practical Bayesian Optimization of Machine Learning Algorithms, 2012, NIPS.
[17] Patrick M. Pilarski, et al. Tuning-free step-size adaptation, 2012, IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[18] John Langford, et al. Normalized Online Learning, 2013, UAI.
[19] Tom Schaul, et al. No more pesky learning rates, 2012, ICML.
[20] Jürgen Schmidhuber. Deep learning in neural networks: An overview, 2014, Neural Networks.
[21] Sergey Ioffe, et al. Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift, 2015, ICML.
[22] Razvan Pascanu, et al. Natural Neural Networks, 2015, NIPS.
[23] Geoffrey E. Hinton, et al. Deep Learning, 2015, Nature.
[24] Jimmy Ba, et al. Adam: A Method for Stochastic Optimization, 2014, ICLR.
[25] Roger B. Grosse, et al. Optimizing Neural Networks with Kronecker-factored Approximate Curvature, 2015, ICML.
[26] Shane Legg, et al. Human-level control through deep reinforcement learning, 2015, Nature.
[27] Marc G. Bellemare, et al. The Arcade Learning Environment: An Evaluation Platform for General Agents (Extended Abstract), 2012, IJCAI.
[28] David Silver, et al. Deep Reinforcement Learning with Double Q-Learning, 2015, AAAI.
[29] Marlos C. Machado, et al. State of the Art Control of Atari Games Using Shallow Reinforcement Learning, 2015, AAMAS.
[30] Benjamin Van Roy, et al. Deep Exploration via Bootstrapped DQN, 2016, NIPS.
[31] Tom Schaul, et al. Dueling Network Architectures for Deep Reinforcement Learning, 2015, ICML.
[32] Guigang Zhang, et al. Deep Learning, 2016, Int. J. Semantic Comput.
[33] David Silver, et al. Learning functions across many orders of magnitudes, 2016, ArXiv.
[34] Alex Graves, et al. Asynchronous Methods for Deep Reinforcement Learning, 2016, ICML.
[35] Tom Schaul, et al. Prioritized Experience Replay, 2015, ICLR.
[36] Marc G. Bellemare, et al. Increasing the Action Gap: New Operators for Reinforcement Learning, 2015, AAAI.