Artificial Development by Reinforcement Learning Can Benefit From Multiple Motivations
[1] Jürgen Schmidhuber,et al. A possibility for implementing curiosity and boredom in model-building neural controllers , 1991 .
[2] Sean R Eddy,et al. What is dynamic programming? , 2004, Nature Biotechnology.
[3] Joseph A. Paradiso,et al. The gesture recognition toolkit , 2014, J. Mach. Learn. Res..
[4] G. Palm. Novelty, Information and Surprise , 2012, Information Science and Statistics.
[5] N. Chater. Rational and mechanistic perspectives on reinforcement learning , 2009, Cognition.
[6] Joscha Bach,et al. A Framework for Emergent Emotions, Based on Motivation and Cognitive Modulators , 2012, Int. J. Synth. Emot..
[7] Friedhelm Schwenker,et al. Neural Network Ensembles in Reinforcement Learning , 2013, Neural Processing Letters.
[8] Matthew E. Taylor,et al. Multi-objectivization and ensembles of shapings in reinforcement learning , 2017, Neurocomputing.
[9] Peter Auer,et al. Using Confidence Bounds for Exploitation-Exploration Trade-offs , 2003, J. Mach. Learn. Res..
[10] Dewen Hu,et al. Multiobjective Reinforcement Learning: A Comprehensive Overview , 2015, IEEE Transactions on Systems, Man, and Cybernetics: Systems.
[11] Hisashi Handa. Solving Multi-objective Reinforcement Learning Problems by EDA-RL - Acquisition of Various Strategies , 2009, 2009 Ninth International Conference on Intelligent Systems Design and Applications.
[12] George G. Lendaris,et al. A retrospective on Adaptive Dynamic Programming for control , 2009, 2009 International Joint Conference on Neural Networks.
[13] P. Dayan,et al. States versus Rewards: Dissociable Neural Prediction Error Signals Underlying Model-Based and Model-Free Reinforcement Learning , 2010, Neuron.
[14] Ann Nowé,et al. Multi-objective reinforcement learning using sets of pareto dominating policies , 2014, J. Mach. Learn. Res..
[15] J. Michael Herrmann,et al. Learning predictive representations , 2000, Neurocomputing.
[16] M.A. Wiering,et al. Computing Optimal Stationary Policies for Multi-Objective Markov Decision Processes , 2007, 2007 IEEE International Symposium on Approximate Dynamic Programming and Reinforcement Learning.
[17] Andrew W. Moore,et al. Reinforcement Learning: A Survey , 1996, J. Artif. Intell. Res..
[18] Ah Chung Tsoi,et al. A fully recursive perceptron network architecture , 2017, 2017 IEEE Symposium Series on Computational Intelligence (SSCI).
[19] Harald Haas,et al. Harnessing Nonlinearity: Predicting Chaotic Systems and Saving Energy in Wireless Communication , 2004, Science.
[20] Pierre-Yves Oudeyer,et al. Intrinsic Motivation Systems for Autonomous Mental Development , 2007, IEEE Transactions on Evolutionary Computation.
[21] P. Dayan,et al. Reinforcement learning: The Good, The Bad and The Ugly , 2008, Current Opinion in Neurobiology.
[22] Günther Palm,et al. A neural framework for adaptive robot control , 2009, Neural Computing and Applications.
[23] Nuttapong Chentanez,et al. Intrinsically Motivated Learning of Hierarchical Collections of Skills , 2004 .
[24] Peter Dayan,et al. Exploration bonuses and dual control , 1996 .
[25] Kaisa Miettinen,et al. Nonlinear multiobjective optimization , 1998, International series in operations research and management science.
[26] Andrea Castelletti,et al. Reinforcement learning in the operational management of a water system , 2002 .
[27] Günther Palm,et al. Adaptive Learning in Continuous Environment Using Actor-Critic Design and Echo-State Networks , 2012, SAB.
[28] Petia Koprinkova-Hristova,et al. Heuristic dynamic programming using echo state network as online trainable adaptive critic , 2012 .
[29] Marco Wiering,et al. Reinforcement Learning , 2014, Adaptation, Learning, and Optimization.
[30] Günther Palm,et al. Meta-Learning of Exploration and Exploitation Parameters with Replacing Eligibility Traces , 2013, PSL.
[31] R. Selten,et al. Bounded rationality: The adaptive toolbox , 2000 .
[32] Shimon Whiteson,et al. A Survey of Multi-Objective Sequential Decision-Making , 2013, J. Artif. Intell. Res..
[33] Silvana M. B. Afonso,et al. A modified NBI and NC method for the solution of N-multiobjective optimization problems , 2012 .
[34] R. Rescorla,et al. A theory of Pavlovian conditioning: Variations in the effectiveness of reinforcement and nonreinforcement , 1972 .
[35] T. Maia. Reinforcement learning, conditioning, and the brain: Successes and challenges , 2009, Cognitive, affective & behavioral neuroscience.
[36] John E. Dennis,et al. Normal-Boundary Intersection: A New Method for Generating the Pareto Surface in Nonlinear Multicriteria Optimization Problems , 1998, SIAM J. Optim..
[37] George Cybenko,et al. Approximation by superpositions of a sigmoidal function , 1989, Math. Control. Signals Syst..
[38] Colin Camerer,et al. Introduction: A Brief History of Neuroeconomics , 2008 .
[39] Demis Hassabis,et al. Mastering the game of Go with deep neural networks and tree search , 2016, Nature.
[40] Sriraam Natarajan,et al. Dynamic preferences in multi-criteria reinforcement learning , 2005, ICML.
[41] Colin Camerer,et al. Introduction: A Brief History of Neuroeconomics , 2009 .
[42] Henry Markram,et al. Real-Time Computing Without Stable States: A New Framework for Neural Computation Based on Perturbations , 2002, Neural Computation.
[43] S. Schaal,et al. Computational motor control in humans and robots , 2005, Current Opinion in Neurobiology.
[44] Marco Wiering,et al. Special issue on multi-objective reinforcement learning , 2017, Neurocomputing.
[45] Peter Vrancx,et al. Reinforcement Learning: State-of-the-Art , 2012 .
[46] Shane Legg,et al. Human-level control through deep reinforcement learning , 2015, Nature.
[47] M. Botvinick,et al. Hierarchically organized behavior and its neural foundations: A reinforcement learning perspective , 2009, Cognition.
[48] Günther Palm,et al. Adaptive Critic Design with ESN Critic for Bioprocess Optimization , 2010, ICANN.
[49] Douglas C. Hittle,et al. Robust reinforcement learning control , 2001, Proceedings of the 2001 American Control Conference. (Cat. No.01CH37148).
[50] Wojciech Pisula. Curiosity and Information Seeking in Animal and Human Behavior , 2009 .
[51] Martin A. Riedmiller,et al. Reinforcement learning for robot soccer , 2009, Auton. Robots.
[52] M. Farries,et al. Reinforcement learning with modulated spike timing dependent synaptic plasticity. , 2007, Journal of neurophysiology.
[53] Jürgen Schmidhuber,et al. Exploring the predictable , 2003 .
[54] Friedrich T. Sommer,et al. Learning in embodied action-perception loops through exploration , 2011, ArXiv.
[55] Günther Palm,et al. Real-Time Emotion Recognition from Speech Using Echo State Networks , 2008, ANNPR.
[56] D. Kahneman. Maps of Bounded Rationality: Psychology for Behavioral Economics , 2003 .
[57] Olaf Sporns,et al. Information-Theoretical Aspects of Embodied Artificial Intelligence , 2003, Embodied Artificial Intelligence.
[58] Günther Palm,et al. Adaptive Exploration Using Stochastic Neurons , 2012, ICANN.
[59] David J. C. MacKay,et al. Information-Based Objective Functions for Active Data Selection , 1992, Neural Computation.
[60] James L. McClelland,et al. Autonomous Mental Development by Robots and Animals , 2001, Science.
[61] Silvia London. Bounded Rationality and Economic Evolution , 2000 .
[62] S Ruzikal,et al. Successive Approach to Compute the Bounded Pareto Front of Practical Multiobjective Optimization Problems , 2009 .
[63] Steve W. C. Chang,et al. Social learning through prediction error in the brain , 2017, npj Science of Learning.
[64] Guy Lever,et al. Deterministic Policy Gradient Algorithms , 2014, ICML.
[65] Kimberly S. Chiew,et al. Positive Affect Versus Reward: Emotional and Motivational Influences on Cognitive Control , 2011, Front. Psychology.
[66] Daniel Polani,et al. Information Theory of Decisions and Actions , 2011 .
[67] Jan Peters,et al. Manifold-based multi-objective policy search with sample reuse , 2017, Neurocomputing.
[68] Craig Boutilier,et al. A POMDP formulation of preference elicitation problems , 2002, AAAI/IAAI.
[69] F. Klix,et al. Bauplan für eine Seele. , 2001 .
[70] Peter Vamplew,et al. Steering approaches to Pareto-optimal multiobjective reinforcement learning , 2017, Neurocomputing.
[71] Martin A. Riedmiller,et al. Reinforcement learning in feedback control , 2011, Machine Learning.
[72] A. G. Butkovskiy,et al. Optimal control of systems , 1966 .
[73] Yuval Tassa,et al. Continuous control with deep reinforcement learning , 2015, ICLR.
[74] H. Simon,et al. A Behavioral Model of Rational Choice , 1955 .
[75] Andrew M. Wikenheiser,et al. Over the river, through the woods: cognitive maps in the hippocampus and orbitofrontal cortex , 2016, Nature Reviews Neuroscience.
[76] Giulio Sandini,et al. Developmental robotics: a survey , 2003, Connect. Sci..
[77] Kenji Doya,et al. Finding intrinsic rewards by embodied evolution and constrained reinforcement learning , 2008, Neural Networks.
[78] H. Simon. Bounded Rationality and Organizational Learning , 1991 .
[79] Shimon Whiteson. Pareto Local Policy Search for MOMDP Planning , 2015 .
[80] José Carlos Príncipe,et al. Balancing exploration and exploitation in reinforcement learning using a value of information criterion , 2017, 2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[81] E. Todorov. Optimality principles in sensorimotor control , 2004, Nature Neuroscience.
[82] Peter Dayan,et al. Values and Actions in Aversion , 2009 .
[83] Michael H. Bowling,et al. Actor-Critic Policy Optimization in Partially Observable Multiagent Environments , 2018, NeurIPS.
[84] Evan Dekker,et al. Empirical evaluation methods for multiobjective reinforcement learning algorithms , 2011, Machine Learning.
[85] Uli Weidma. Vergleichende Verhaltensforschung: Grundlagen der Ethologie , 1980 .
[86] Murray S. Davis,et al. That's Interesting! , 1971 .
[87] Petia Koprinkova-Hristova,et al. Learning of embodied interaction dynamics with recurrent neural networks: some exploratory experiments , 2014, Journal of neural engineering.
[88] Manuela Ruiz-Montiel. Multi-objective Reinforcement Learning , 2013 .
[89] Terrence J. Sejnowski,et al. Exploration Bonuses and Dual Control , 1996, Machine Learning.
[90] G. Palm,et al. Multiobjective Reinforcement Learning Using Adaptive Dynamic Programming And Reservoir Computing , 2013 .
[91] Marcello Restelli,et al. Multi-Objective Reinforcement Learning with Continuous Pareto Frontier Approximation , 2014, AAAI.
[92] Wee Chin Wong,et al. A reinforcement learning‐based scheme for direct adaptive optimal control of linear stochastic systems , 2010 .
[93] Jürgen Schmidhuber,et al. Efficient model-based exploration , 1998 .