Structure in the Space of Value Functions
[1] Richard Fikes,et al. Learning and Executing Generalized Robot Plans , 1993, Artif. Intell..
[2] R. Cox,et al. Journal of the Royal Statistical Society B , 1972 .
[3] Austin Tate,et al. Generating Project Networks , 1977, IJCAI.
[4] D. Rubin,et al. Maximum likelihood from incomplete data via the EM algorithm plus discussions on the paper , 1977 .
[5] P. Varaiya,et al. Multilayer control of large Markov chains , 1978 .
[6] Edward H. Adelson,et al. The Laplacian Pyramid as a Compact Image Code , 1983, IEEE Trans. Commun..
[7] Richard S. Sutton,et al. Neuronlike adaptive elements that can solve difficult learning control problems , 1983, IEEE Transactions on Systems, Man, and Cybernetics.
[8] Richard E. Korf,et al. Macro-Operators: A Weak Method for Learning , 1985, Artif. Intell..
[9] C. Watkins. Learning from delayed rewards , 1989 .
[10] Richard S. Sutton,et al. Learning and Sequential Decision Making , 1989 .
[11] D. N. Geary. Mixture Models: Inference and Applications to Clustering , 1989 .
[12] M. Gabriel,et al. Learning and Computational Neuroscience: Foundations of Adaptive Networks , 1990 .
[13] Thomas M. Cover,et al. Elements of Information Theory , 2005 .
[14] Austin Tate,et al. O-Plan: The open Planning Architecture , 1991, Artif. Intell..
[15] Geoffrey E. Hinton,et al. Feudal Reinforcement Learning , 1992, NIPS.
[16] Satinder P. Singh,et al. Reinforcement Learning with a Hierarchy of Abstract Models , 1992, AAAI.
[17] Geoffrey E. Hinton,et al. Autoencoders, Minimum Description Length and Helmholtz Free Energy , 1993, NIPS.
[18] Laurent D. Cohen,et al. Finite-Element Methods for Active Contour Models and Balloons for 2-D and 3-D Images , 1993, IEEE Trans. Pattern Anal. Mach. Intell..
[19] Mahesan Niranjan,et al. On-line Q-learning using connectionist systems , 1994 .
[20] Sebastian Thrun,et al. Finding Structure in Reinforcement Learning , 1994, NIPS.
[21] Michael I. Jordan,et al. Reinforcement Learning with Soft State Aggregation , 1994, NIPS.
[22] Richard S. Sutton,et al. TD Models: Modeling the World at a Mixture of Time Scales , 1995, ICML.
[23] Ben J. A. Kröse,et al. Learning from delayed rewards , 1995, Robotics Auton. Syst..
[24] Andrew G. Barto,et al. Improving Elevator Performance Using Reinforcement Learning , 1995, NIPS.
[25] Geoffrey J. Gordon. Stable Fitted Reinforcement Learning , 1995, NIPS.
[26] Krzysztof J. Cios,et al. Advances in neural information processing systems 7 , 1997 .
[27] Stuart J. Russell,et al. Reinforcement Learning with Hierarchies of Machines , 1997, NIPS.
[28] Doina Precup,et al. Multi-time Models for Temporally Abstract Planning , 1997, NIPS.
[29] Geoffrey E. Hinton,et al. Generative models for discovering sparse distributed representations , 1997, Philosophical transactions of the Royal Society of London. Series B, Biological sciences.
[30] Manfred K. Warmuth,et al. Exponentiated Gradient Versus Gradient Descent for Linear Predictors , 1997, Inf. Comput..
[31] Brendan J. Frey,et al. Efficient Stochastic Source Coding and an Application to a Bayesian Network Source Model , 1997, Comput. J..
[32] Milos Hauskrecht,et al. Hierarchical Solution of Markov Decision Processes using Macro-actions , 1998, UAI.
[33] Doina Precup,et al. Theoretical Results on Reinforcement Learning with Temporally Abstract Options , 1998, ECML.
[34] Ronald E. Parr,et al. Hierarchical control and learning for Markov decision processes , 1998 .
[35] Doina Precup,et al. Between MDPs and Semi-MDPs: Learning, Planning & Representing Knowledge at Multiple Temporal Scales , 1998 .
[36] Jorma Rissanen,et al. Stochastic Complexity in Statistical Inquiry , 1989, World Scientific Series in Computer Science.
[37] Thomas G. Dietterich. The MAXQ Method for Hierarchical Reinforcement Learning , 1998, ICML.
[38] Richard S. Sutton,et al. Introduction to Reinforcement Learning , 1998 .
[39] Chris Drummond,et al. Composing Functions to Speed up Reinforcement Learning in a Changing World , 1998, ECML.
[40] Craig Boutilier,et al. Decision-Theoretic Planning: Structural Assumptions and Computational Leverage , 1999, J. Artif. Intell. Res..
[41] Peter Sollich,et al. Advances in neural information processing systems 11 , 1999 .
[42] Andrew W. Moore,et al. Multi-Value-Functions: Efficient Automatic Action Hierarchies for Multiple Goal MDPs , 1999, IJCAI.
[43] Zoubin Ghahramani,et al. Variational Inference for Bayesian Mixtures of Factor Analysers , 1999, NIPS.
[44] Thomas G. Dietterich. Hierarchical Reinforcement Learning with the MAXQ Value Function Decomposition , 1999, J. Artif. Intell. Res..
[45] Thomas G. Dietterich,et al. Editors. Advances in Neural Information Processing Systems , 2002 .
[46] Richard S. Sutton,et al. Reinforcement Learning: An Introduction , 1998, IEEE Trans. Neural Networks.