Xi Chen | Tim Salimans | Ilya Sutskever | Jonathan Ho
[1] Ingo Rechenberg,et al. Evolutionsstrategie : Optimierung technischer Systeme nach Prinzipien der biologischen Evolution , 1973 .
[2] H. P. Schwefel,et al. Numerische Optimierung von Computermodellen mittels der Evolutionsstrategie , 1977 .
[3] J. Geweke,et al. Antithetic acceleration of Monte Carlo integration in Bayesian inference , 1988 .
[4] J. Spall. Multivariate stochastic approximation using a simultaneous perturbation gradient approximation , 1992 .
[5] Stuart J. Russell,et al. Reinforcement Learning with Hierarchies of Machines , 1997, NIPS.
[6] Jieyu Zhao,et al. Direct Policy Search and Uncertain Policy Evaluation , 1998 .
[7] Nikolaus Hansen,et al. Completely Derandomized Self-Adaptation in Evolution Strategies , 2001, Evolutionary Computation.
[8] Ronald J. Williams,et al. Simple Statistical Gradient-Following Algorithms for Connectionist Reinforcement Learning , 1992, Machine Learning.
[9] Ben Tse,et al. Autonomous Inverted Helicopter Flight via Reinforcement Learning , 2004, ISER.
[10] Jürgen Schmidhuber,et al. Training Recurrent Networks by Evolino , 2007, Neural Computation.
[11] Tom Schaul,et al. Natural Evolution Strategies , 2008, 2008 IEEE Congress on Evolutionary Computation (IEEE World Congress on Computational Intelligence).
[12] Tom Schaul,et al. Efficient natural evolution strategies , 2009, GECCO.
[13] Kenneth O. Stanley,et al. A Hypercube-Based Encoding for Evolving Large-Scale Neural Networks , 2009, Artificial Life.
[14] Tom Schaul,et al. Stochastic search using the natural gradient , 2009, ICML '09.
[15] Anne Auger,et al. Mirrored Sampling and Sequential Selection for Evolution Strategies , 2010, PPSN.
[16] Jürgen Schmidhuber,et al. Evolving neural networks in compressed weight space , 2010, GECCO '10.
[17] Tom Schaul,et al. Exponential natural evolution strategies , 2010, GECCO '10.
[18] Tom Schaul,et al. A Natural Evolution Strategy for Multi-objective Optimization , 2010, PPSN.
[19] Frank Sehnke,et al. Parameter-exploring policy gradients , 2010, Neural Networks.
[20] Tom Schaul,et al. High dimensions and heavy tails for natural evolution strategies , 2011, GECCO '11.
[21] Yurii Nesterov,et al. Efficiency of Coordinate Descent Methods on Huge-Scale Optimization Problems , 2012, SIAM J. Optim.
[22] Jürgen Schmidhuber,et al. Generalized compressed network search , 2012, GECCO '12.
[23] Yuval Tassa,et al. MuJoCo: A physics engine for model-based control , 2012, 2012 IEEE/RSJ International Conference on Intelligent Robots and Systems.
[24] Olivier Sigaud,et al. Policy Improvement Methods: Between Black-Box Optimization and Episodic Reinforcement Learning , 2012 .
[25] Jürgen Schmidhuber,et al. Evolving large-scale neural networks for vision-based reinforcement learning , 2013, GECCO '13.
[26] Risto Miikkulainen,et al. A Neuroevolution Approach to General Atari Game Playing , 2014, IEEE Transactions on Computational Intelligence and AI in Games.
[27] Sergey Ioffe,et al. Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift , 2015, ICML.
[28] Yoshua Bengio,et al. Show, Attend and Tell: Neural Image Caption Generation with Visual Attention , 2015, ICML.
[29] Sergey Levine,et al. Trust Region Policy Optimization , 2015, ICML.
[30] Martin J. Wainwright,et al. Optimal Rates for Zero-Order Convex Optimization: The Power of Two Function Evaluations , 2013, IEEE Transactions on Information Theory.
[31] Elliot Meyerson,et al. Frame Skip Is a Powerful Parameter for Learning to Play Atari , 2015, AAAI Workshop: Learning for General Competency in Video Games.
[32] Shane Legg,et al. Human-level control through deep reinforcement learning , 2015, Nature.
[33] Kenji Kawaguchi,et al. Deep Learning without Poor Local Minima , 2016, NIPS.
[34] Pieter Abbeel,et al. Benchmarking Deep Reinforcement Learning for Continuous Control , 2016, ICML.
[35] Wojciech Zaremba,et al. Improved Techniques for Training GANs , 2016, NIPS.
[36] Alex Graves,et al. Asynchronous Methods for Deep Reinforcement Learning , 2016, ICML.
[37] Demis Hassabis,et al. Mastering the game of Go with deep neural networks and tree search , 2016, Nature.
[38] Peter L. Bartlett,et al. RL$^2$: Fast Reinforcement Learning via Slow Reinforcement Learning , 2016, ArXiv.
[39] Sergey Levine,et al. End-to-End Training of Deep Visuomotor Policies , 2015, J. Mach. Learn. Res.
[40] Sergey Levine,et al. High-Dimensional Continuous Control Using Generalized Advantage Estimation , 2015, ICLR.
[41] Nicolas Usunier,et al. Episodic Exploration for Deep Deterministic Policies: An Application to StarCraft Micromanagement Tasks , 2016, ArXiv.
[42] Jürgen Schmidhuber,et al. A Wavelet-based Encoding for Neuroevolution , 2016, GECCO.
[43] Yurii Nesterov,et al. Random Gradient-Free Minimization of Convex Functions , 2015, Foundations of Computational Mathematics.
[44] David A. Patterson,et al. In-datacenter performance analysis of a tensor processing unit , 2017, 2017 ACM/IEEE 44th Annual International Symposium on Computer Architecture (ISCA).
[45] Julian Togelius,et al. Neuroevolution in Games: State of the Art and Open Challenges , 2014, IEEE Transactions on Computational Intelligence and AI in Games.
[46] Kocsis Zoltán Tamás,et al. IEEE World Congress on Computational Intelligence , 2019, IEEE Computational Intelligence Magazine.