Inverse Reinforcement Learning via Nonparametric Spatio-Temporal Subgoal Modeling
Adrian Šošić | Elmar Rueckert | Jan Peters | Abdelhak M. Zoubir | Heinz Koeppl
[1] Christos Dimitrakakis et al. Bayesian Multitask Inverse Reinforcement Learning, 2011, EWRL.
[2] Pravesh Ranchod et al. Nonparametric Bayesian reward segmentation for skill discovery using inverse reinforcement learning, 2015, IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS).
[3] Jun Nakanishi et al. Learning Movement Primitives, 2005, ISRR.
[4] Christos Dimitrakakis et al. Preference elicitation and inverse reinforcement learning, 2011, ECML/PKDD.
[5] Csaba Szepesvári et al. Apprenticeship Learning using Inverse Reinforcement Learning and Gradient Methods, 2007, UAI.
[6] Marc Toussaint et al. Learned graphical models for probabilistic planning provide a new class of movement primitives, 2013, Front. Comput. Neurosci.
[7] Claudio Gentile et al. Boltzmann Exploration Done Right, 2017, NIPS.
[8] Jan Peters et al. Hierarchical Relative Entropy Policy Search, 2014, AISTATS.
[9] Michael I. Jordan et al. Latent Dirichlet Allocation, 2001, J. Mach. Learn. Res.
[10] Doina Precup et al. Learning Options in Reinforcement Learning, 2002, SARA.
[11] Michael O. Duff et al. Reinforcement Learning Methods for Continuous-Time Markov Decision Problems, 1994, NIPS.
[12] Sanjay Krishnan et al. HIRL: Hierarchical Inverse Reinforcement Learning for Long-Horizon Tasks with Delayed Rewards, 2016, arXiv.
[13] Anind K. Dey et al. Maximum Entropy Inverse Reinforcement Learning, 2008, AAAI.
[14] Richard S. Sutton et al. Reinforcement Learning: An Introduction, 1998, IEEE Trans. Neural Networks.
[15] Jonathan P. How et al. Bayesian Nonparametric Inverse Reinforcement Learning, 2012, ECML/PKDD.
[16] Amit Surana et al. Bayesian Nonparametric Inverse Reinforcement Learning for Switched Markov Decision Processes, 2014, 13th International Conference on Machine Learning and Applications.
[17] Nicholas J. Foti et al. A Survey of Non-Exchangeable Priors for Bayesian Nonparametric Models, 2012, IEEE Transactions on Pattern Analysis and Machine Intelligence.
[18] C. D. Gelatt et al. Optimization by Simulated Annealing, 1983, Science.
[19] Bruce M. Kapron et al. Dynamic graph connectivity in polylogarithmic worst case time, 2013, SODA.
[20] Heinz Koeppl et al. Inverse Reinforcement Learning via Nonparametric Subgoal Modeling, 2018, AAAI Spring Symposia.
[21] Eyal Amir et al. Bayesian Inverse Reinforcement Learning, 2007, IJCAI.
[22] G. Roberts et al. Updating Schemes, Correlation Structure, Blocking and Parameterization for the Gibbs Sampler, 1997.
[23] Er Meng Joo et al. A survey of inverse reinforcement learning techniques, 2012.
[24] Scott Niekum et al. Learning and generalization of complex tasks from unstructured demonstrations, 2012, IEEE/RSJ International Conference on Intelligent Robots and Systems.
[25] Jan Peters et al. Probabilistic inference for determining options in reinforcement learning, 2016, Machine Learning.
[26] Shie Mannor et al. Bayesian Reinforcement Learning: A Survey, 2015, Found. Trends Mach. Learn.
[27] Brett Browning et al. A survey of robot learning from demonstration, 2009, Robotics Auton. Syst.
[28] Heinz Koeppl et al. A Bayesian Approach to Policy Recognition and State Representation Learning, 2016, IEEE Transactions on Pattern Analysis and Machine Intelligence.
[29] Ambuj Tewari et al. Optimistic Linear Programming gives Logarithmic Regret for Irreducible MDPs, 2007, NIPS.
[30] Andrew Y. Ng et al. Algorithms for Inverse Reinforcement Learning, 2000, ICML.
[31] Pieter Abbeel et al. Apprenticeship learning via inverse reinforcement learning, 2004, ICML.
[32] Xiaodong Li et al. Learning Options for an MDP from Demonstrations, 2015, ACALCI.
[33] Jan Peters et al. Learning movement primitive libraries through probabilistic segmentation, 2017, Int. J. Robotics Res.
[34] Piotr J. Gmytrasiewicz et al. Interactive POMDPs with finite-state models of other agents, 2017, Autonomous Agents and Multi-Agent Systems.
[35] Kian Hsiang Low et al. Inverse Reinforcement Learning with Locally Consistent Reward Functions, 2015, NIPS.
[36] Sergey Levine et al. Nonlinear Inverse Reinforcement Learning with Gaussian Processes, 2011, NIPS.
[37] Martin L. Puterman. Markov Decision Processes: Discrete Stochastic Dynamic Programming, 1994.
[38] Michael L. Littman et al. Apprenticeship Learning About Multiple Intentions, 2011, ICML.
[39] Kee-Eung Kim et al. Nonparametric Bayesian Inverse Reinforcement Learning for Multiple Reward Functions, 2012, NIPS.
[40] Leslie Pack Kaelbling et al. On the Complexity of Solving Markov Decision Problems, 1995, UAI.
[41] Peter I. Frazier et al. Distance dependent Chinese restaurant processes, 2009, ICML.
[42] Shalabh Bhatnagar et al. Natural actor-critic algorithms, 2009, Autom.
[43] Alicia P. Wolfe et al. Identifying useful subgoals in reinforcement learning by local graph partitioning, 2005, ICML.
[44] Mostafa Al-Emran. Hierarchical Reinforcement Learning: A Survey, 2015.
[45] D. Aldous. Exchangeability and related topics, 1985.
[46] Jonathan P. How et al. Bayesian Nonparametric Reward Learning From Demonstration, 2015, IEEE Transactions on Robotics.
[47] Andrew Y. Ng et al. Policy Invariance Under Reward Transformations: Theory and Application to Reward Shaping, 1999, ICML.
[48] Chong-Wah Ngo et al. Evaluating bag-of-visual-words representations in scene classification, 2007, MIR '07.
[49] Peter Stone et al. Autonomous agents modelling other agents: A comprehensive survey and open problems, 2017, Artif. Intell.
[50] Jonathan P. How et al. Scalable reward learning from demonstration, 2013, IEEE International Conference on Robotics and Automation.
[51] Richard S. Sutton et al. Introduction to Reinforcement Learning, 1998.
[52] Kenneth Dixon et al. Introduction to Stochastic Modeling, 2011.
[53] M. Botvinick. Hierarchical reinforcement learning and decision making, 2012, Current Opinion in Neurobiology.
[54] Burr Settles. Active Learning Literature Survey, 2009.
[55] Nir Friedman et al. Probabilistic Graphical Models: Principles and Techniques, 2009.
[56] Scott Kuindersma et al. Robot learning from demonstration by constructing skill trees, 2012, Int. J. Robotics Res.
[57] Doina Precup et al. Between MDPs and Semi-MDPs: A Framework for Temporal Abstraction in Reinforcement Learning, 1999, Artif. Intell.