Action Redundancy in Reinforcement Learning