Misha Denil | Nando de Freitas | Alexander Novikov | Ksenia Konyushkova | Ziyu Wang | Scott Reed | Yusuf Aytar | Caglar Gulcehre | Konrad Żołna
[1] Che Wang, et al. BAIL: Best-Action Imitation Learning for Batch Deep Reinforcement Learning, 2020, NeurIPS.
[2] Gang Niu, et al. Convex Formulation for Learning from Positive and Unlabeled Data, 2015, ICML.
[3] Sergey Levine, et al. Advantage-Weighted Regression: Simple and Scalable Off-Policy Reinforcement Learning, 2019, ArXiv.
[4] Nando de Freitas, et al. Critic Regularized Regression, 2020, NeurIPS.
[5] Dean Pomerleau. ALVINN: An Autonomous Land Vehicle in a Neural Network, 1988, NIPS.
[6] Anca D. Dragan, et al. SQIL: Imitation Learning via Reinforcement Learning with Sparse Rewards, 2019, ICLR.
[7] Martin A. Riedmiller, et al. Leveraging Demonstrations for Deep Reinforcement Learning on Robotics Problems with Sparse Rewards, 2017, ArXiv.
[8] Martin A. Riedmiller, et al. Keep Doing What Worked: Behavioral Modelling Priors for Offline Reinforcement Learning, 2020, ICLR.
[9] Justin Fu, et al. D4RL: Datasets for Deep Data-Driven Reinforcement Learning, 2020, ArXiv.
[10] Charles Elkan, et al. Learning Classifiers from Only Positive and Unlabeled Data, 2008, KDD.
[11] Nando de Freitas, et al. RL Unplugged: Benchmarks for Offline Reinforcement Learning, 2020, ArXiv.
[12] Richard S. Sutton, et al. Reinforcement Learning: An Introduction, 1998, MIT Press.
[13] Stefano Ermon, et al. Generative Adversarial Imitation Learning, 2016, NIPS.
[14] Doina Precup, et al. Off-Policy Deep Reinforcement Learning without Exploration, 2018, ICML.
[15] Marcin Andrychowicz, et al. Overcoming Exploration in Reinforcement Learning with Demonstrations, 2018, ICRA.
[16] Stefano Ermon, et al. InfoGAIL: Interpretable Imitation Learning from Visual Demonstrations, 2017, NIPS.
[17] Misha Denil, et al. Positive-Unlabeled Reward Learning, 2019, CoRL.
[18] Nando de Freitas, et al. Reinforcement and Imitation Learning for Diverse Visuomotor Skills, 2018, Robotics: Science and Systems.
[19] Andrew Y. Ng, et al. Algorithms for Inverse Reinforcement Learning, 2000, ICML.
[20] Sergey Levine, et al. Guided Cost Learning: Deep Inverse Optimal Control via Policy Optimization, 2016, ICML.
[21] Pieter Abbeel, et al. Apprenticeship Learning via Inverse Reinforcement Learning, 2004, ICML.
[22] Oleg O. Sushkov, et al. Scaling Data-Driven Robotics with Reward Sketching and Batch Reinforcement Learning, 2019, Robotics: Science and Systems.
[23] Sergey Levine, et al. Learning Robust Rewards with Adversarial Inverse Reinforcement Learning, 2017, ICLR.
[24] Richard Tanburn, et al. Making Efficient Use of Demonstrations to Solve Hard Exploration Problems, 2019, ICLR.
[25] Ilya Kostrikov, et al. Discriminator-Actor-Critic: Addressing Sample Inefficiency and Reward Bias in Adversarial Imitation Learning, 2018, ICLR.
[26] Geoffrey J. Gordon, et al. A Reduction of Imitation Learning and Structured Prediction to No-Regret Online Learning, 2010, AISTATS.
[27] Yoshua Bengio, et al. Reinforced Imitation in Heterogeneous Action Space, 2019, ArXiv.
[28] Shie Mannor, et al. End-to-End Differentiable Adversarial Imitation Learning, 2017, ICML.
[29] Yuval Tassa, et al. Learning Human Behaviors from Motion Capture by Adversarial Imitation, 2017, ArXiv.
[30] Matthew W. Hoffman, et al. Distributed Distributional Deterministic Policy Gradients, 2018, ICLR.
[31] Gang Niu, et al. Positive-Unlabeled Learning with Non-Negative Risk Estimator, 2017, NIPS.
[32] Oleg O. Sushkov, et al. A Practical Approach to Insertion with Variable Socket Position Using Deep Reinforcement Learning, 2019, ICRA.
[33] Martin A. Riedmiller, et al. Batch Reinforcement Learning, 2012, Reinforcement Learning.
[34] Sergey Levine, et al. Learning Complex Dexterous Manipulation with Deep Reinforcement Learning and Demonstrations, 2017, Robotics: Science and Systems.
[35] Yoshua Bengio, et al. Combating False Negatives in Adversarial Imitation Learning, 2021, IJCNN.
[36] Nando de Freitas, et al. Hyperparameter Selection for Offline Reinforcement Learning, 2020, ArXiv.
[37] Misha Denil, et al. Task-Relevant Adversarial Imitation Learning, 2019, CoRL.
[38] Rémi Munos, et al. Observe and Look Further: Achieving Consistent Performance on Atari, 2018, ArXiv.
[39] Qing Wang, et al. Exponentially Weighted Imitation Learning for Batched Historical Data, 2018, NeurIPS.
[40] Pieter Abbeel, et al. An Algorithmic Perspective on Imitation Learning, 2018, Found. Trends Robotics.
[41] Sergey Levine, et al. Offline Reinforcement Learning: Tutorial, Review, and Perspectives on Open Problems, 2020, ArXiv.