Samy Bengio | Rémi Munos | Oriol Vinyals | Chiyuan Zhang
[1] Yishay Mansour,et al. Approximate Planning in Large POMDPs via Reusable Trajectories , 1999, NIPS.
[2] Peter Auer,et al. Using upper confidence bounds for online learning , 2000, Proceedings 41st Annual Symposium on Foundations of Computer Science.
[3] Peter Auer,et al. Finite-time Analysis of the Multiarmed Bandit Problem , 2002, Machine Learning.
[4] Erica Moodie,et al. Dynamic treatment regimes , 2004, Clinical trials.
[5] Richard S. Sutton,et al. Reinforcement Learning: An Introduction , 1998, IEEE Trans. Neural Networks.
[6] Peter Auer,et al. Near-optimal Regret Bounds for Reinforcement Learning , 2008, J. Mach. Learn. Res.
[7] Lihong Li,et al. Reinforcement Learning in Finite MDPs: PAC Analysis , 2009, J. Mach. Learn. Res.
[8] Dale Schuurmans,et al. Learning Exercise Policies for American Options , 2009, AISTATS.
[9] Yaoliang Yu,et al. A General Projection Property for Distribution Families , 2009, NIPS.
[10] Ronald E. Parr,et al. A Novel Benchmark Methodology and Data Repository for Real-life Reinforcement Learning , 2009.
[11] Shimon Whiteson,et al. Protecting against evaluation overfitting in empirical reinforcement learning , 2011, 2011 IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning (ADPRL).
[12] Sébastien Bubeck,et al. Regret Analysis of Stochastic and Nonstochastic Multi-armed Bandit Problems , 2012, Found. Trends Mach. Learn.
[13] Alex Graves,et al. Playing Atari with Deep Reinforcement Learning , 2013, ArXiv.
[14] Joan Bruna,et al. Signal recovery from Pooling Representations , 2013, ICML.
[15] John Langford,et al. Taming the Monster: A Fast and Simple Algorithm for Contextual Bandits , 2014, ICML.
[16] Tor Lattimore,et al. Near-optimal PAC bounds for discounted MDPs , 2014, Theor. Comput. Sci.
[17] Zheng Wen,et al. Optimal Demand Response Using Device-Based Reinforcement Learning , 2014, IEEE Transactions on Smart Grid.
[18] Peter Stone,et al. The Impact of Determinism on Learning Atari 2600 Games , 2015, AAAI Workshop: Learning for General Competency in Video Games.
[19] Shane Legg,et al. Massively Parallel Methods for Deep Reinforcement Learning , 2015, ArXiv.
[20] Elliot Meyerson,et al. Frame Skip Is a Powerful Parameter for Learning to Play Atari , 2015, AAAI Workshop: Learning for General Competency in Video Games.
[21] Shane Legg,et al. Human-level control through deep reinforcement learning , 2015, Nature.
[22] Marc G. Bellemare,et al. The Arcade Learning Environment: An Evaluation Platform for General Agents (Extended Abstract) , 2012, IJCAI.
[23] M. Kosorok,et al. Adaptive Treatment Strategies in Practice: Planning Trials and Analyzing Data for Personalized Medicine , 2015.
[24] Yuval Tassa,et al. Continuous control with deep reinforcement learning , 2015, ICLR.
[25] Yoram Singer,et al. Train faster, generalize better: Stability of stochastic gradient descent , 2015, ICML.
[26] Jian Sun,et al. Deep Residual Learning for Image Recognition , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[27] Frans A. Oliehoek,et al. Coordinated Deep Reinforcement Learners for Traffic Light Control , 2016.
[28] Lorenzo Rosasco,et al. Generalization Properties and Implicit Regularization for Multiple Passes SGM , 2016, ICML.
[29] Alex Graves,et al. Asynchronous Methods for Deep Reinforcement Learning , 2016, ICML.
[30] Demis Hassabis,et al. Mastering the game of Go with deep neural networks and tree search , 2016, Nature.
[31] Jim Duggan,et al. An Experimental Review of Reinforcement Learning Algorithms for Adaptive Traffic Signal Control , 2016, Autonomic Road Transport Support Systems.
[32] Nan Jiang,et al. Contextual Decision Processes with low Bellman rank are PAC-Learnable , 2016, ICML.
[33] Demis Hassabis,et al. Mastering Chess and Shogi by Self-Play with a General Reinforcement Learning Algorithm , 2017, ArXiv.
[34] Kibok Lee,et al. Towards Understanding the Invertibility of Convolutional Neural Networks , 2017, IJCAI.
[35] Gintare Karolina Dziugaite,et al. Computing Nonvacuous Generalization Bounds for Deep (Stochastic) Neural Networks with Many More Parameters than Training Data , 2017, UAI.
[36] Samy Bengio,et al. Understanding deep learning requires rethinking generalization , 2016, ICLR.
[37] Tor Lattimore,et al. Unifying PAC and Regret: Uniform PAC Bounds for Episodic Reinforcement Learning , 2017, NIPS.
[38] Tom Schaul,et al. StarCraft II: A New Challenge for Reinforcement Learning , 2017, ArXiv.
[39] Rémi Munos,et al. Minimax Regret Bounds for Reinforcement Learning , 2017, ICML.
[40] Damien Ernst,et al. Reinforcement Learning for Electric Power System Decision and Control: Past Considerations and Perspectives , 2017.
[41] Matus Telgarsky,et al. Spectrally-normalized margin bounds for neural networks , 2017, NIPS.
[42] Quoc V. Le,et al. Understanding Generalization and Stochastic Gradient Descent , 2017.
[43] Demis Hassabis,et al. Mastering the game of Go without human knowledge , 2017, Nature.
[44] Sandy H. Huang,et al. Adversarial Attacks on Neural Network Policies , 2017, ICLR.
[45] Nathan Srebro,et al. Exploring Generalization in Deep Learning , 2017, NIPS.
[46] Youyong Kong,et al. Deep Direct Reinforcement Learning for Financial Signal Representation and Trading , 2017, IEEE Transactions on Neural Networks and Learning Systems.
[47] Philip Bachman,et al. Deep Reinforcement Learning that Matters , 2017, AAAI.
[48] Marlos C. Machado,et al. Revisiting the Arcade Learning Environment: Evaluation Protocols and Open Problems for General Agents , 2017, J. Artif. Intell. Res.
[49] Jacob Andreas,et al. Can Deep Reinforcement Learning Solve Erdos-Selfridge-Spencer Games? , 2017, ICML.
[50] Quoc V. Le,et al. A Bayesian Perspective on Generalization and Stochastic Gradient Descent , 2017, ICLR.
[51] Tomaso A. Poggio,et al. Fisher-Rao Metric, Geometry, and Complexity of Neural Networks , 2017, AISTATS.
[52] Andrew M. Saxe,et al. High-dimensional dynamics of generalization error in neural networks , 2017, Neural Networks.