Todd Hester | Matej Vecerík | Olivier Pietquin | Marc Lanctot | Tom Schaul | Bilal Piot | Andrew Sendonaris | Gabriel Dulac-Arnold | Ian Osband | John Agapiou | Joel Z. Leibo | Audrunas Gruslys
[1] Andrew Y. Ng, et al. Algorithms for Inverse Reinforcement Learning, 2000, ICML.
[2] Guy Shani, et al. An MDP-Based Recommender System, 2002, J. Mach. Learn. Res.
[3] Pieter Abbeel, et al. Apprenticeship learning via inverse reinforcement learning, 2004, ICML.
[4] Pieter Abbeel, et al. An Application of Reinforcement Learning to Aerobatic Helicopter Flight, 2006, NIPS.
[5] Robert E. Schapire, et al. A Game-Theoretic Approach to Apprenticeship Learning, 2007, NIPS.
[6] Michael H. Bowling, et al. Apprenticeship learning using linear programming, 2008, ICML '08.
[7] Geoffrey J. Gordon, et al. A Reduction of Imitation Learning and Structured Prediction to No-Regret Online Learning, 2010, AISTATS.
[8] Eduardo F. Morales, et al. An Introduction to Reinforcement Learning, 2011.
[9] Michael L. Littman, et al. Apprenticeship Learning About Multiple Intentions, 2011, ICML.
[10] Sonia Chernova, et al. Integrating reinforcement learning with human demonstrations of varying ability, 2011, AAMAS.
[11] Peter Stone, et al. TEXPLORE: real-time sample-efficient reinforcement learning for robots, 2012, Machine Learning.
[12] Joelle Pineau, et al. Learning from Limited Demonstrations, 2013, NIPS.
[13] Matthieu Geist, et al. Learning from Demonstrations: Is It Worth Estimating a Reward Function?, 2013, ECML/PKDD.
[14] Matthieu Geist, et al. Boosted Bellman Residual Minimization Handling Expert Demonstrations, 2014, ECML/PKDD.
[15] Matthieu Geist, et al. Boosted and reward-regularized classification for apprenticeship learning, 2014, AAMAS.
[16] Alessandro Lazaric, et al. Direct Policy Iteration with Demonstrations, 2015, IJCAI.
[17] Martin A. Riedmiller, et al. Embed to Control: A Locally Linear Latent Dynamics Model for Control from Raw Images, 2015, NIPS.
[18] Andrea Lockerd Thomaz, et al. Policy Shaping with Human Teachers, 2015, IJCAI.
[19] Shane Legg, et al. Human-level control through deep reinforcement learning, 2015, Nature.
[20] Marc G. Bellemare, et al. The Arcade Learning Environment: An Evaluation Platform for General Agents (Extended Abstract), 2012, IJCAI.
[21] Sonia Chernova, et al. Reinforcement Learning from Demonstration through Shaping, 2015, IJCAI.
[22] Sergey Levine, et al. Guided Cost Learning: Deep Inverse Optimal Control via Policy Optimization, 2016, ICML.
[23] Sonia Chernova, et al. Learning from Demonstration for Shaping through Inverse Reinforcement Learning, 2016, AAMAS.
[24] David Silver, et al. Deep Reinforcement Learning with Double Q-Learning, 2015, AAAI.
[25] David Silver, et al. Learning values across many orders of magnitude, 2016, NIPS.
[26] Stefano Ermon, et al. Generative Adversarial Imitation Learning, 2016, NIPS.
[27] Tom Schaul, et al. Dueling Network Architectures for Deep Reinforcement Learning, 2015, ICML.
[28] Guigang Zhang, et al. Deep Learning, 2016, Int. J. Semantic Comput.
[29] Alex Graves, et al. Asynchronous Methods for Deep Reinforcement Learning, 2016, ICML.
[30] Traian Rebedea, et al. Playing Atari Games with Deep Reinforcement Learning and Human Checkpoint Replay, 2016, ArXiv.
[31] Demis Hassabis, et al. Mastering the game of Go with deep neural networks and tree search, 2016, Nature.
[32] Sergey Levine, et al. End-to-End Training of Deep Visuomotor Policies, 2015, J. Mach. Learn. Res.
[33] Tom Schaul, et al. Prioritized Experience Replay, 2015, ICLR.
[34] Andrea Lockerd Thomaz, et al. Exploration from Demonstration for Interactive Reinforcement Learning, 2016, AAMAS.
[35] Jianfeng Gao, et al. Efficient Exploration for Dialog Policy Learning with Deep BBQ Networks & Replay Buffer Spiking, 2016, ArXiv.
[36] Marc G. Bellemare, et al. The Reactor: A Sample-Efficient Actor-Critic Architecture, 2017, ArXiv.
[37] Marc G. Bellemare, et al. Count-Based Exploration with Neural Density Models, 2017, ICML.
[38] Marcin Andrychowicz, et al. One-Shot Imitation Learning, 2017, NIPS.
[39] Stefan Schaal, et al. Learning from Demonstration, 1996, NIPS.
[40] Tom Schaul, et al. Reinforcement Learning with Unsupervised Auxiliary Tasks, 2016, ICLR.