Manifold-based multi-objective policy search with sample reuse