The Alignment Problem for Bayesian History-Based Reinforcement Learners∗
[1] Marcus Hutter, et al. Self-Modification of Policy and Utility Function in Rational Agents, 2016, AGI.
[2] M. Schervish, et al. State-Dependent Utilities, 1990.
[3] Mark O. Riedl, et al. Using Stories to Teach Human Values to Artificial Agents, 2016, AAAI Workshop: AI, Ethics, and Society.
[4] Laurent Orseau, et al. AI Safety Gridworlds, 2017, ArXiv.
[5] Laurent Orseau, et al. Safely Interruptible Agents, 2016, UAI.
[6] Jürgen Schmidhuber, et al. Goedel Machines: Self-Referential Universal Problem Solvers Making Provably Optimal Self-Improvements, 2003, ArXiv.
[7] Jon Bird, et al. The Evolved Radio and Its Implications for Modelling the Evolution of Novel Sensors, 2002, Proceedings of the 2002 Congress on Evolutionary Computation (CEC'02).
[8] Laurent Orseau, et al. Universal Knowledge-Seeking Agents, 2011, Theor. Comput. Sci.
[9] Shane Legg, et al. Human-Level Control through Deep Reinforcement Learning, 2015, Nature.
[10] Jürgen Schmidhuber, et al. World Models, 2018, ArXiv.
[11] Laurent Orseau, et al. Delusion, Survival, and Intelligent Agents, 2011, AGI.
[12] James L. Olds, et al. Positive Reinforcement Produced by Electrical Stimulation of Septal Area and Other Regions of Rat Brain, 1954, Journal of Comparative and Physiological Psychology.
[13] G. Smith. Anarchy, State, and Utopia, 1976.
[14] Stuart Armstrong, et al. 'Indifference' Methods for Managing Agent Rewards, 2017, ArXiv.
[15] Richard S. Sutton, et al. Reinforcement Learning: An Introduction, 1998, IEEE Trans. Neural Networks.
[16] Justin A. Boyan, et al. Least-Squares Temporal Difference Learning, 1999, ICML.
[17] Marcus Hutter, et al. A Causal Influence Diagram Perspective, 2019.
[18] Leslie Pack Kaelbling, et al. Planning and Acting in Partially Observable Stochastic Domains, 1998, Artif. Intell.
[19] N. Wiener, et al. Some Moral and Technical Consequences of Automation, 1960, Science.
[20] Roman V. Yampolskiy, et al. Artificial Superintelligence: A Futuristic Approach, 2015.
[21] Marcus Hutter, et al. Avoiding Wireheading with Value Reinforcement Learning, 2016, AGI.
[22] John Schulman, et al. Concrete Problems in AI Safety, 2016, ArXiv.
[23] J. Schreiber. Foundations of Statistics, 2016.
[24] Shane Legg, et al. Deep Reinforcement Learning from Human Preferences, 2017, NIPS.
[25] Marcus Hutter. A Complete Theory of Everything (Will Be Subjective), 2010, Algorithms.
[26] M. Strathern. 'Improving Ratings': Audit in the British University System, 1997, European Review.
[27] Pieter Abbeel, et al. An Application of Reinforcement Learning to Aerobatic Helicopter Flight, 2006, NIPS.
[28] Risto Miikkulainen, et al. The Surprising Creativity of Digital Evolution: A Collection of Anecdotes from the Evolutionary Computation and Artificial Life Research Communities, 2018, Artificial Life.
[29] Anca D. Dragan, et al. Cooperative Inverse Reinforcement Learning, 2016, NIPS.
[30] C. Robert. Superintelligence: Paths, Dangers, Strategies, 2017.
[31] David A. Rottenberg, et al. Compulsive Thalamic Self-Stimulation: A Case with Metabolic, Electrophysiologic and Behavioral Correlates, 1986, Pain.
[32] Bill Hibbard, et al. Model-Based Utility Functions, 2011, J. Artif. Gen. Intell.
[33] Owain Evans, et al. Trial without Error: Towards Safe Reinforcement Learning via Human Intervention, 2017, AAMAS.
[34] Demis Hassabis, et al. Mastering Chess and Shogi by Self-Play with a General Reinforcement Learning Algorithm, 2017, ArXiv.
[35] J. Pearl. Causality: Models, Reasoning and Inference, 2000.
[36] Stephen Omohundro. The Nature of Self-Improving Artificial Intelligence, 2008.
[37] Tor Lattimore, et al. General Time Consistent Discounting, 2014, Theor. Comput. Sci.
[38] Marcus Hutter, et al. Reward Tampering Problems and Solutions in Reinforcement Learning: A Causal Influence Diagram Perspective, 2019, Synthese.