Behavioral building blocks for autonomous agents: description, identification, and learning
[1] R. A. Howard. Dynamic Programming and Markov Processes , 1960 .
[2] Saul Amarel,et al. On representations of problems of reasoning about actions , 1968 .
[3] Richard O. Duda,et al. Pattern Classification and Scene Analysis , 1973 .
[4] Laurent Siklóssy,et al. The Role of Preprocessing in Problem Solving Systems , 1977, IJCAI.
[5] Linton C. Freeman. A set of measures of centrality based on betweenness , 1977 .
[6] L. Freeman. Centrality in social networks: conceptual clarification , 1978 .
[7] Richard E. Korf,et al. Macro-Operators: A Weak Method for Learning , 1985, Artif. Intell..
[8] Rodney A. Brooks,et al. A Robust Layered Control System For A Mobile Robot , 1986 .
[9] John Scott. What is social network analysis , 2010 .
[10] C. Watkins. Learning from delayed rewards , 1989 .
[11] Glenn A. Iba,et al. A heuristic approach to the discovery of macro-operators , 2004, Machine Learning.
[12] Richard S. Sutton,et al. Integrated Architectures for Learning, Planning, and Reacting Based on Approximating Dynamic Programming , 1990, ML.
[13] Stewart W. Wilson,et al. A Possibility for Implementing Curiosity and Boredom in Model-Building Neural Controllers , 1991 .
[14] Andrew B. Kahng,et al. New spectral methods for ratio cut partitioning and clustering , 1991, IEEE Trans. Comput. Aided Des. Integr. Circuits Syst..
[15] Sebastian Thrun,et al. Efficient Exploration In Reinforcement Learning , 1992 .
[16] C. Atkeson,et al. Prioritized Sweeping: Reinforcement Learning with Less Data and Less Time , 1993, Machine Learning.
[17] Richard M. Leahy,et al. An Optimal Graph Theoretic Approach to Data Clustering: Theory and Its Application to Image Segmentation , 1993, IEEE Trans. Pattern Anal. Mach. Intell..
[18] J. Peng,et al. Efficient Learning and Planning Within the Dyna Framework , 1993, IEEE International Conference on Neural Networks.
[19] Martin L. Puterman,et al. Markov Decision Processes: Discrete Stochastic Dynamic Programming , 1994 .
[20] Michael O. Duff,et al. Reinforcement Learning Methods for Continuous-Time Markov Decision Problems , 1994, NIPS.
[21] Sebastian Thrun,et al. Finding Structure in Reinforcement Learning , 1994, NIPS.
[22] S. Wasserman,et al. Social Network Analysis: Computer Programs , 1994 .
[23] S. Hochreiter,et al. Reinforcement driven information acquisition in nondeterministic environments , 1995 .
[24] Andrew W. Moore,et al. Reinforcement Learning: A Survey , 1996, J. Artif. Intell. Res..
[25] John N. Tsitsiklis,et al. Neuro-Dynamic Programming , 1996, Encyclopedia of Machine Learning.
[26] Stuart J. Russell,et al. Reinforcement Learning with Hierarchies of Machines , 1997, NIPS.
[27] Richard S. Sutton,et al. Roles of Macro-Actions in Accelerating Reinforcement Learning , 1998 .
[28] Jitendra Malik,et al. Normalized cuts and image segmentation , 1997, Proceedings of IEEE Computer Society Conference on Computer Vision and Pattern Recognition.
[29] Ronald E. Parr,et al. Hierarchical control and learning for Markov decision processes , 1998 .
[30] Bruce L. Digney,et al. Learning hierarchical control structures for multiple tasks and changing environments , 1998 .
[31] Daniel E. Koditschek,et al. Sequential Composition of Dynamically Dexterous Robot Behaviors , 1999, Int. J. Robotics Res..
[32] Doina Precup,et al. Between MDPs and Semi-MDPs: A Framework for Temporal Abstraction in Reinforcement Learning , 1999, Artif. Intell..
[33] Thomas G. Dietterich. Hierarchical Reinforcement Learning with the MAXQ Value Function Decomposition , 1999, J. Artif. Intell. Res..
[34] Doina Precup,et al. Temporal abstraction in reinforcement learning , 2000, ICML 2000.
[35] Peter Dayan,et al. Dopamine Bonuses , 2000, NIPS.
[36] Andrew G. Barto,et al. Automatic Discovery of Subgoals in Reinforcement Learning using Diverse Density , 2001, ICML.
[37] Richard O. Duda,et al. Pattern Classification , 2001 .
[38] James L. McClelland,et al. Autonomous Mental Development by Robots and Animals , 2001, Science.
[39] U. Brandes. A faster algorithm for betweenness centrality , 2001 .
[40] Bernhard Hengst,et al. Discovering Hierarchy in Reinforcement Learning with HEXQ , 2002, ICML.
[41] Andrew G. Barto,et al. PolicyBlocks: An Algorithm for Creating Useful Macro-Actions in Reinforcement Learning , 2002, ICML.
[42] Amy McGovern. Autonomous Discovery of Abstractions through Interaction with an Environment , 2002, SARA.
[43] Doina Precup,et al. Learning Options in Reinforcement Learning , 2002, SARA.
[44] Andrew G. Barto,et al. Optimal learning: computational procedures for Bayes-adaptive Markov decision processes , 2002 .
[45] Shie Mannor,et al. Q-Cut - Dynamic Discovery of Sub-goals in Reinforcement Learning , 2002, ECML.
[46] Pierre-Yves Oudeyer,et al. Motivational principles for visual know-how development , 2003 .
[47] Michael O. Duff,et al. Design for an Optimal Probe , 2003, ICML.
[48] Automated Discovery of Options in Reinforcement Learning , 2003 .
[49] Nuttapong Chentanez,et al. Intrinsically Motivated Reinforcement Learning , 2004, NIPS.
[50] Shie Mannor,et al. Dynamic abstraction in reinforcement learning via clustering , 2004, ICML.
[51] Nuttapong Chentanez,et al. Intrinsically Motivated Learning of Hierarchical Collections of Skills , 2004 .
[52] Michael Kearns,et al. Near-Optimal Reinforcement Learning in Polynomial Time , 2002, Machine Learning.
[53] Andrew W. Moore,et al. Prioritized Sweeping: Reinforcement Learning with Less Data and Less Time , 1993, Machine Learning.
[54] Andrew G. Barto,et al. Using relative novelty to identify useful temporal abstractions in reinforcement learning , 2004, ICML.
[55] Alicia P. Wolfe,et al. Identifying useful subgoals in reinforcement learning by local graph partitioning , 2005, ICML.
[56] Andrew G. Barto,et al. A causal approach to hierarchical decomposition of factored MDPs , 2005, ICML.
[57] Richard S. Sutton,et al. Reinforcement Learning: An Introduction , 1998, IEEE Trans. Neural Networks.
[58] Sridhar Mahadevan,et al. Proto-value functions: developmental reinforcement learning , 2005, ICML.
[59] A. Barto,et al. Intrinsic Motivation For Reinforcement Learning Systems , 2005 .
[60] Andrew G. Barto,et al. An intrinsic reward mechanism for efficient exploration , 2006, ICML.
[61] Andrea Bonarini,et al. Self-Development Framework for Reinforcement Learning Agents , 2006 .
[62] Leslie Pack Kaelbling,et al. Learning Hierarchical Structure in Policies , 2007, NIPS 2007.
[63] Vimal Mathew,et al. Automated Spatio-Temporal Abstraction in Reinforcement Learning , 2007 .
[64] Sridhar Mahadevan,et al. Proto-value Functions: A Laplacian Framework for Learning Representation and Control in Markov Decision Processes , 2007, J. Mach. Learn. Res..
[65] Andrew G. Barto,et al. Skill Characterization Based on Betweenness , 2008, NIPS.
[66] Thomas G. Dietterich,et al. Automatic discovery and transfer of MAXQ hierarchies , 2008, ICML '08.
[67] U. Rieder,et al. Markov Decision Processes , 2010 .