[1] Bernard Harris, et al. Graph theory and its applications, 1970.
[2] Audra E. Kosh, et al. Linear Algebra and its Applications, 1992.
[3] Martin L. Puterman, et al. Markov Decision Processes: Discrete Stochastic Dynamic Programming, 1994.
[4] Richard S. Sutton, et al. Introduction to Reinforcement Learning, 1998.
[5] Doina Precup, et al. Between MDPs and Semi-MDPs: A Framework for Temporal Abstraction in Reinforcement Learning, 1999, Artif. Intell.
[6] Thomas G. Dietterich. Hierarchical Reinforcement Learning with the MAXQ Value Function Decomposition, 1999, J. Artif. Intell. Res.
[7] Doina Precup, et al. Temporal abstraction in reinforcement learning, 2000, ICML.
[8] Andrew G. Barto, et al. Automatic Discovery of Subgoals in Reinforcement Learning using Diverse Density, 2001, ICML.
[9] Bernhard Hengst, et al. Discovering Hierarchy in Reinforcement Learning with HEXQ, 2002, ICML.
[10] Terrence J. Sejnowski, et al. Slow Feature Analysis: Unsupervised Learning of Invariances, 2002, Neural Computation.
[11] Shie Mannor, et al. Q-Cut - Dynamic Discovery of Sub-goals in Reinforcement Learning, 2002, ECML.
[12] Marcus Weber, et al. Perron Cluster Analysis and Its Connection to Graph Partitioning for Noisy Data, 2004.
[13] Shie Mannor, et al. Dynamic abstraction in reinforcement learning via clustering, 2004, ICML.
[14] Peter Dayan, et al. Technical Note: Q-Learning, 2004, Machine Learning.
[15] Andrew G. Barto, et al. Using relative novelty to identify useful temporal abstractions in reinforcement learning, 2004, ICML.
[16] Sean R. Eddy, et al. What is dynamic programming?, 2004, Nature Biotechnology.
[17] Alicia P. Wolfe, et al. Identifying useful subgoals in reinforcement learning by local graph partitioning, 2005, ICML.
[18] Richard S. Sutton, et al. Reinforcement Learning: An Introduction, 1998, IEEE Trans. Neural Networks.
[19] Sridhar Mahadevan, et al. Proto-value functions: developmental reinforcement learning, 2005, ICML.
[20] Sridhar Mahadevan, et al. Proto-value Functions: A Laplacian Framework for Learning Representation and Control in Markov Decision Processes, 2007, J. Mach. Learn. Res.
[21] Andrew G. Barto, et al. Skill Characterization Based on Betweenness, 2008, NIPS.
[22] Andrew G. Barto, et al. Skill Discovery in Continuous Reinforcement Learning Domains using Skill Chaining, 2009, NIPS.
[23] Csaba Szepesvári, et al. Algorithms for Reinforcement Learning, 2010, Synthesis Lectures on Artificial Intelligence and Machine Learning.
[24] Jürgen Schmidhuber, et al. Incremental Slow Feature Analysis, 2011, IJCAI.
[25] Henning Sprekeler, et al. On the Relation of Slow Feature Analysis and Laplacian Eigenmaps, 2011, Neural Computation.
[26] Pierre-Yves Oudeyer, et al. Exploration strategies in developmental robotics: A unified probabilistic framework, 2013, IEEE Third Joint International Conference on Development and Learning and Epigenetic Robotics (ICDL).
[27] Pierre-Yves Oudeyer, et al. Active learning of inverse models with intrinsically motivated goal exploration in robots, 2013, Robotics Auton. Syst.
[28] Pierre-Luc Bacon. On the Bottleneck Concept for Options Discovery: Theoretical Underpinnings and Extension in Continuous State Spaces, 2013.
[29] Christoph Salge, et al. Empowerment - an Introduction, 2013, ArXiv.
[30] Alec Solway, et al. Optimal Behavioral Hierarchy, 2014, PLoS Comput. Biol.
[31] Marc G. Bellemare, et al. The Arcade Learning Environment: An Evaluation Platform for General Agents (Extended Abstract), 2012, IJCAI.
[32] Benjamin Van Roy, et al. Generalization and Exploration via Randomized Value Functions, 2014, ICML.
[33] Marlos C. Machado, et al. Learning Purposeful Behaviour in the Absence of Rewards, 2016, ArXiv.
[34] Alex Graves, et al. Strategic Attentive Writer for Learning Macro-Actions, 2016, NIPS.
[35] Honglak Lee, et al. Control of Memory, Active Perception, and Action in Minecraft, 2016, ICML.
[36] Jan Peters, et al. Probabilistic inference for determining options in reinforcement learning, 2016, Machine Learning.
[37] Balaraman Ravindran, et al. Option Discovery in Hierarchical Reinforcement Learning using Spatio-Temporal Clustering, 2016, arXiv:1605.05359.
[38] Joshua B. Tenenbaum, et al. Hierarchical Deep Reinforcement Learning: Integrating Temporal Abstraction and Intrinsic Motivation, 2016, NIPS.
[39] Shie Mannor, et al. Adaptive Skills Adaptive Partitions (ASAP), 2016, NIPS.
[40] Doina Precup, et al. The Option-Critic Architecture, 2016, AAAI.
[41] Marijn F. Stollenga, et al. Continual curiosity-driven skill acquisition from high-dimensional video inputs for humanoid robots, 2017, Artif. Intell.