Penalizing Side Effects using Stepwise Relative Reachability
Victoria Krakovna | Laurent Orseau | Miljan Martic | Shane Legg
[1] John Schulman, et al. Concrete Problems in AI Safety, 2016, ArXiv.
[2] Christoph Salge, et al. Empowerment - an Introduction, 2013, ArXiv.
[3] Tomás Svoboda, et al. Safe Exploration Techniques for Reinforcement Learning - An Overview, 2014, MESAS.
[4] Shakir Mohamed, et al. Variational Information Maximisation for Intrinsically Motivated Reinforcement Learning, 2015, NIPS.
[5] Yuval Tassa, et al. Safe Exploration in Continuous Action Spaces, 2018, ArXiv.
[6] Alexandre M. Bayen, et al. A time-dependent Hamilton-Jacobi formulation of reachable sets for continuous dynamic games, 2005, IEEE Transactions on Automatic Control.
[7] Jaime F. Fisac, et al. A General Safety Framework for Learning-Based Control in Uncertain Robotic Systems, 2017, IEEE Transactions on Automatic Control.
[8] Tom Schaul, et al. Deep Q-learning From Demonstrations, 2017, AAAI.
[9] Sergey Levine, et al. Leave no Trace: Learning to Reset for Safe and Autonomous Reinforcement Learning, 2017, ICLR.
[10] Daan Wierstra, et al. Variational Intrinsic Control, 2016, ICLR.
[11] Ralph Neuneier, et al. Risk-Sensitive Reinforcement Learning, 1998, Machine Learning.
[12] Andrew Y. Ng, et al. Algorithms for Inverse Reinforcement Learning, 2000, ICML.
[13] Edmund H. Durfee, et al. Minimax-Regret Querying on Side Effects for Safe Optimality in Factored Markov Decision Processes, 2018, IJCAI.
[14] Anca D. Dragan, et al. Cooperative Inverse Reinforcement Learning, 2016, NIPS.
[15] Shane Legg, et al. Deep Reinforcement Learning from Human Preferences, 2017, NIPS.
[16] Pieter Abbeel, et al. Safe Exploration in Markov Decision Processes, 2012, ICML.
[17] Dylan Hadfield-Menell, et al. Conservative Agency via Attainable Utility Preservation, 2019, AIES.
[18] Owain Evans, et al. Trial without Error: Towards Safe Reinforcement Learning via Human Intervention, 2017, AAMAS.
[19] Claire J. Tomlin, et al. Guaranteed Safe Online Learning via Reachability: tracking a ground target using a quadrotor, 2012, IEEE International Conference on Robotics and Automation.
[20] Anca D. Dragan, et al. Preferences Implicit in the State of the World, 2018, ICLR.
[21] Pieter Abbeel, et al. Apprenticeship learning via inverse reinforcement learning, 2004, ICML.
[22] Jianfeng Gao, et al. Combating Reinforcement Learning's Sisyphean Curse with Intrinsic Fear, 2016, ArXiv.
[23] Anind K. Dey, et al. Maximum Entropy Inverse Reinforcement Learning, 2008, AAAI.
[24] Shie Mannor, et al. Risk-Sensitive and Robust Decision-Making: a CVaR Optimization Approach, 2015, NIPS.
[25] Javier García, et al. A comprehensive survey on safe reinforcement learning, 2015, J. Mach. Learn. Res.
[26] Klaus Obermayer, et al. Risk-Sensitive Reinforcement Learning, 2013, Neural Computation.
[27] Anca D. Dragan, et al. Inverse Reward Design, 2017, NIPS.
[28] Andreas Krause, et al. Safe Exploration in Finite Markov Decision Processes with Gaussian Processes, 2016, NIPS.
[29] Jessica Taylor, et al. Quantilizers: A Safer Alternative to Maximizers for Limited Optimization, 2016, AAAI Workshop: AI, Ethics, and Society.
[30] Chrystopher L. Nehaniv, et al. All Else Being Equal Be Empowered, 2005, ECAL.
[31] Laurent Orseau, et al. Safely Interruptible Agents, 2016, UAI.
[32] Jessica Taylor, et al. Alignment for Advanced Machine Learning Systems, 2020, Ethics of Artificial Intelligence.
[33] Laurent Orseau, et al. AI Safety Gridworlds, 2017, ArXiv.
[34] Stuart Armstrong, et al. Low Impact Artificial Intelligences, 2017, ArXiv.
[35] John McCarthy, et al. Some Philosophical Problems from the Standpoint of Artificial Intelligence, 1987.
[36] Peter Stone, et al. Deep TAMER: Interactive Agent Shaping in High-Dimensional State Spaces, 2017, AAAI.