Ramana Kumar | Shane Legg | Tom Everitt | Jonathan Uesato | Victoria Krakovna | Richard Ngo
[1] Max Jaderberg,et al. Population Based Training of Neural Networks , 2017, ArXiv.
[2] Soumya Ghosh,et al. Quality of Uncertainty Quantification for Bayesian Neural Network Inference , 2019, ArXiv.
[3] Peter Stone,et al. Deep TAMER: Interactive Agent Shaping in High-Dimensional State Spaces , 2017, AAAI.
[4] Carlos Celemin,et al. COACH: Learning continuous actions from COrrective Advice Communicated by Humans , 2015, 2015 International Conference on Advanced Robotics (ICAR).
[5] Owain Evans,et al. Trial without Error: Towards Safe Reinforcement Learning via Human Intervention , 2017, AAMAS.
[6] Yishay Mansour,et al. Policy Gradient Methods for Reinforcement Learning with Function Approximation , 1999, NIPS.
[7] Andrew Y. Ng,et al. Algorithms for Inverse Reinforcement Learning , 2000, ICML.
[8] Michael I. Jordan,et al. MASSACHUSETTS INSTITUTE OF TECHNOLOGY ARTIFICIAL INTELLIGENCE LABORATORY and CENTER FOR BIOLOGICAL AND COMPUTATIONAL LEARNING DEPARTMENT OF BRAIN AND COGNITIVE SCIENCES , 1996 .
[9] Pieter Abbeel,et al. Apprenticeship learning via inverse reinforcement learning , 2004, ICML.
[10] Laurent Orseau,et al. Delusion, Survival, and Intelligent Agents , 2011, AGI.
[11] Laurent Orseau,et al. Reinforcement Learning with a Corrupted Reward Channel , 2017, IJCAI.
[12] Sergey Levine,et al. Model-Agnostic Meta-Learning for Fast Adaptation of Deep Networks , 2017, ICML.
[13] Geoffrey J. Gordon,et al. A Reduction of Imitation Learning and Structured Prediction to No-Regret Online Learning , 2010, AISTATS.
[14] Daniel Dewey,et al. Learning What to Value , 2011, AGI.
[15] Tom Schaul,et al. Prioritized Experience Replay , 2015, ICLR.
[16] Wojciech M. Czarnecki,et al. Grandmaster level in StarCraft II using multi-agent reinforcement learning , 2019, Nature.
[17] Nando de Freitas,et al. Neural Programmer-Interpreters , 2015, ICLR.
[18] Scott Garrabrant,et al. Risks from Learned Optimization in Advanced Machine Learning Systems , 2019, ArXiv.
[19] Marcus Hutter,et al. Reinforcement learning with value advice , 2014, ACML.
[20] Tom Everitt,et al. Towards Safe Artificial General Intelligence , 2018 .
[21] Krzysztof Z. Gajos,et al. Preference elicitation for interface optimization , 2005, UIST.
[22] Dario Amodei,et al. AI safety via debate , 2018, ArXiv.
[23] Philip Bachman,et al. Deep Reinforcement Learning that Matters , 2017, AAAI.
[24] Sergey Levine,et al. Data-Efficient Hierarchical Reinforcement Learning , 2018, NeurIPS.
[25] Anca D. Dragan,et al. Inverse Reward Design , 2017, NIPS.
[26] Yee Whye Teh,et al. Neural probabilistic motor primitives for humanoid control , 2018, ICLR.
[27] Sergey Levine,et al. Generalizing Skills with Semi-Supervised Reinforcement Learning , 2016, ICLR.
[28] Luca Belli,et al. From Optimizing Engagement to Measuring Value , 2020, ArXiv.
[29] Nick Cammarata,et al. Zoom In: An Introduction to Circuits , 2020 .
[30] Shane Legg,et al. Scalable agent alignment via reward modeling: a research direction , 2018, ArXiv.
[31] P. Stone,et al. TAMER: Training an Agent Manually via Evaluative Reinforcement , 2008, 2008 7th IEEE International Conference on Development and Learning.
[32] Arvind Satyanarayan,et al. The Building Blocks of Interpretability , 2018 .
[33] Alex Graves,et al. Practical Variational Inference for Neural Networks , 2011, NIPS.
[34] Alec Radford,et al. Proximal Policy Optimization Algorithms , 2017, ArXiv.
[35] Doina Precup,et al. Between MDPs and Semi-MDPs: A Framework for Temporal Abstraction in Reinforcement Learning , 1999, Artif. Intell..
[36] Anca D. Dragan,et al. Learning Human Objectives by Evaluating Hypothetical Behavior , 2019, ICML.
[37] Rémi Munos,et al. Recurrent Experience Replay in Distributed Reinforcement Learning , 2018, ICLR.
[38] Richard Tanburn,et al. Making Efficient Use of Demonstrations to Solve Hard Exploration Problems , 2019, ICLR.
[39] Shane Legg,et al. The Incentives that Shape Behaviour , 2020, ArXiv.
[40] Peter Stone,et al. Interactively shaping agents via human reinforcement: the TAMER framework , 2009, K-CAP '09.
[41] Sridhar Mahadevan,et al. Recent Advances in Hierarchical Reinforcement Learning , 2003, Discret. Event Dyn. Syst..
[42] Anca D. Dragan,et al. Cooperative Inverse Reinforcement Learning , 2016, NIPS.
[43] Michael L. Littman,et al. Deep Reinforcement Learning from Policy-Dependent Human Feedback , 2019, ArXiv.
[44] Tom Schaul,et al. Reinforcement Learning with Unsupervised Auxiliary Tasks , 2016, ICLR.
[45] John Schulman,et al. Concrete Problems in AI Safety , 2016, ArXiv.
[46] David Krueger,et al. Misleading Meta-Objectives and Hidden Incentives for Distributional Shift , 2019 .
[47] Stefano Ermon,et al. Generative Adversarial Imitation Learning , 2016, NIPS.
[48] Chelsea Finn,et al. Language as an Abstraction for Hierarchical Deep Reinforcement Learning , 2019, NeurIPS.
[49] N. Bostrom. Superintelligence: Paths, Dangers, Strategies , 2014 .
[50] Dario Amodei,et al. Supervising strong learners by amplifying weak experts , 2018, ArXiv.
[51] Stuart Armstrong,et al. Good and safe uses of AI Oracles , 2017, ArXiv.
[52] Richard S. Sutton,et al. Reinforcement Learning: An Introduction , 1998, IEEE Trans. Neural Networks.
[53] Ramana Kumar,et al. Modeling AGI Safety Frameworks with Causal Influence Diagrams , 2019, AISafety@IJCAI.
[54] Marcus Hutter,et al. Reward tampering problems and solutions in reinforcement learning: a causal influence diagram perspective , 2019, Synthese.
[55] Jason Mancuso,et al. Detecting Spiky Corruption in Markov Decision Processes , 2019, AISafety@IJCAI.