Reinforcement learning for robots using neural networks
[1] Arthur L. Samuel,et al. Some Studies in Machine Learning Using the Game of Checkers , 1967, IBM J. Res. Dev..
[2] R. Bellman,et al. Dynamic Programming and Markov Processes , 1960 .
[3] Tom M. Mitchell,et al. Generalization as Search , 1982, Artif. Intell..
[4] Richard S. Sutton,et al. Temporal credit assignment in reinforcement learning , 1984 .
[5] Hans P. Moravec,et al. High resolution maps from wide angle sonar , 1985, Proceedings. 1985 IEEE International Conference on Robotics and Automation.
[6] Geoffrey E. Hinton,et al. Learning internal representations by error propagation , 1986 .
[7] John H. Holland,et al. Escaping brittleness: the possibilities of general-purpose learning algorithms applied to parallel rule-based systems , 1995 .
[8] Mozer,et al. RAMBOT (Restructuring Associative Memory Based on Training): a connectionist expert system that learns by example. Technical report, October 1985-April 1986 , 1986 .
[9] Bernardo A. Huberman,et al. An Improved Three Layer, Back Propagation Algorithm , 1987 .
[10] Ronald L. Rivest,et al. Diversity-based inference of finite automata , 1994, 28th Annual Symposium on Foundations of Computer Science (sfcs 1987).
[11] Paul J. Werbos,et al. Building and Understanding Adaptive Systems: A Statistical/Numerical Approach to Factory Automation and Brain Research , 1987, IEEE Transactions on Systems, Man, and Cybernetics.
[12] Richard E. Korf,et al. Planning as Search: A Quantitative Approach , 1987, Artif. Intell..
[13] Charles W. Anderson,et al. Strategy Learning with Multilayer Connectionist Representations , 1987 .
[14] Paul J. Werbos,et al. Generalization of backpropagation with application to a recurrent gas market model , 1988, Neural Networks.
[15] Bernard Widrow,et al. Adaptive switching circuits , 1988 .
[16] Scott E. Fahlman,et al. An empirical study of learning speed in back-propagation networks , 1988 .
[17] Dean Pomerleau,et al. ALVINN: An Autonomous Land Vehicle in a Neural Network , 1988, NIPS.
[18] Reid Simmons,et al. Experience with a Task Control Architecture for Mobile Robots , 1989 .
[19] D. Ballard,et al. A Role for Anticipation in Reactive Systems that Learn , 1989, ML.
[20] C. Watkins. Learning from delayed rewards , 1989 .
[21] Richard S. Sutton,et al. Learning and Sequential Decision Making , 1989 .
[22] Christian Lebiere,et al. The Cascade-Correlation Learning Architecture , 1989, NIPS.
[23] Michael I. Jordan,et al. Learning to Control an Unstable System with Forward Modeling , 1989, NIPS.
[24] Ronald J. Williams,et al. A Learning Algorithm for Continually Running Fully Recurrent Neural Networks , 1989, Neural Computation.
[25] Alexander H. Waibel,et al. Modular Construction of Time-Delay Neural Networks for Speech Recognition , 1989, Neural Computation.
[26] C. Atkeson. Learning arm kinematics and dynamics. , 1989, Annual review of neuroscience.
[27] Jürgen Schmidhuber,et al. Reinforcement Learning in Markovian and Non-Markovian Environments , 1990, NIPS.
[28] Rodney A. Brooks,et al. Learning to Coordinate Behaviors , 1990, AAAI.
[29] Geoffrey E. Hinton,et al. Distributed Representations , 1986, The Philosophy of Artificial Intelligence.
[30] Sebastian Thrun,et al. Planning with an Adaptive World Model , 1990, NIPS.
[31] Alexander H. Waibel,et al. The Tempo 2 Algorithm: Adjusting Time-Delays By Supervised Learning , 1990, NIPS.
[32] Tom M. Mitchell,et al. Becoming Increasingly Reactive , 1990, AAAI.
[33] Jeffrey L. Elman,et al. Finding Structure in Time , 1990, Cogn. Sci..
[34] David Chapman,et al. Vision, instruction, and action , 1990 .
[35] Andrew W. Moore,et al. Efficient memory-based learning for robot control , 1990 .
[36] Scott E. Fahlman,et al. The Recurrent Cascade-Correlation Architecture , 1990, NIPS.
[37] Richard S. Sutton,et al. Integrated Architectures for Learning, Planning, and Reacting Based on Approximating Dynamic Programming , 1990, ML.
[38] Ming Tan,et al. Learning a Cost-Sensitive Internal Representation for Reinforcement Learning , 1991, ML.
[39] Michael C. Mozer,et al. SLUG: A Connectionist Architecture for Inferring the Structure of Finite-State Environments , 1991, Mach. Learn..
[40] Leslie Pack Kaelbling,et al. Input Generalization in Delayed Reinforcement Learning: An Algorithm and Performance Comparisons , 1991, IJCAI.
[41] Sridhar Mahadevan,et al. Scaling Reinforcement Learning to Robotics by Exploiting the Subsumption Architecture , 1991, ML Workshop.
[42] Christopher G. Atkeson,et al. Using locally weighted regression for robot learning , 1991, Proceedings. 1991 IEEE International Conference on Robotics and Automation.
[43] Steven D. Whitehead,et al. Complexity and Cooperation in Q-Learning , 1991, ML.
[44] Anders Krogh,et al. Introduction to the theory of neural computation , 1994, The advanced book program.
[45] Long-Ji Lin,et al. Programming Robots Using Reinforcement Learning and Teaching , 1991, AAAI.
[46] Long-Ji Lin,et al. Self-improving reactive agents: case studies of reinforcement learning frameworks , 1991 .
[47] Craig A. Knoblock. Automatically generating abstractions for problem solving , 1991 .
[48] Ming Tan,et al. Cost-sensitive robot learning , 1991 .
[49] Satinder Singh. Transfer of learning by composing solutions of elemental sequential tasks , 2004, Machine Learning.
[50] Alan D. Christiansen,et al. Automatic acquisition of task theories for robotic manipulation , 1992 .
[51] J. Millán,et al. A Reinforcement Connectionist Approach to Robot Path Finding in Non-Maze-Like Environments , 2004, Machine Learning.
[52] Satinder P. Singh,et al. Scaling Reinforcement Learning Algorithms by Learning Variable Temporal Resolution Models , 1992, ML Workshop.
[53] Yolanda Gil,et al. Acquiring domain knowledge for planning by experimentation , 1992 .
[54] G. Tesauro. Practical Issues in Temporal Difference Learning , 1992 .
[55] Andrew H. Fagg,et al. Genetic programming approach to the construction of a neural network for control of a walking robot , 1992, Proceedings 1992 IEEE International Conference on Robotics and Automation.
[56] Lonnie Chrisman,et al. Reinforcement Learning with Perceptual Aliasing: The Perceptual Distinctions Approach , 1992, AAAI.
[57] Ajay Naresh Jain,et al. Parsec: a connectionist learning architecture for parsing spoken language , 1992 .
[58] Satinder P. Singh,et al. Reinforcement Learning with a Hierarchy of Abstract Models , 1992, AAAI.
[59] Andrew McCallum,et al. Using Transitional Proximity for Faster Reinforcement Learning , 1992, ML.
[60] Sebastian Thrun,et al. Efficient Exploration In Reinforcement Learning , 1992 .
[61] Sebastian Thrun,et al. Explanation-Based Neural Network Learning for Robot Control , 1992, NIPS.
[62] Long-Ji Lin,et al. Memory Approaches to Reinforcement Learning in Non-Markovian Domains , 1992 .
[63] Leslie Pack Kaelbling,et al. Learning in embedded systems , 1993 .
[64] Marco Dorigo,et al. Genetics-based machine learning and behavior-based robotics: a new synthesis , 1993, IEEE Trans. Syst. Man Cybern..
[65] Dean A. Pomerleau,et al. Neural Network Perception for Mobile Robot Guidance , 1993 .
[66] J. Peng,et al. Efficient Learning and Planning Within the Dyna Framework , 1993, IEEE International Conference on Neural Networks.