A Reinforcement Connectionist Approach to Robot Path Finding in Non-Maze-Like Environments

This paper presents a reinforcement connectionist system that discovers and learns suitable situation-action rules for generating feasible paths for a point robot in a 2D environment with circular obstacles. The basic reinforcement algorithm is extended with a strategy for discovering stable solution paths. Equipped with this strategy and a powerful codification scheme, the path-finder (i) learns quickly, (ii) deals with continuous-valued inputs and outputs, (iii) exhibits good noise tolerance and generalization, (iv) copes with dynamic environments, and (v) solves an instance of the path-finding problem under strong performance demands.
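A core ingredient named in the abstract, continuous-valued actions learned from a scalar reinforcement signal, traces back to stochastic real-valued reinforcement units (Gullapalli, 1988). As a rough illustration of that idea only, and not the paper's actual path-finder, codification scheme, or stability strategy, a minimal SRV-style learner on a toy task might look like:

```python
import random

class SRVUnit:
    """Stochastic real-valued (SRV) reinforcement unit in the spirit of
    Gullapalli (1988). The output is drawn from a Gaussian whose mean is a
    linear function of the input; weights are nudged toward perturbations
    that earn above-baseline reinforcement. Names and constants here are
    illustrative, not taken from the paper."""

    def __init__(self, n_inputs, lr=0.05, sigma=0.2):
        self.w = [0.0] * n_inputs   # linear policy weights
        self.lr = lr                # learning rate
        self.sigma = sigma          # fixed exploration noise
        self.baseline = 0.0         # running estimate of expected reward

    def act(self, x):
        mean = sum(wi * xi for wi, xi in zip(self.w, x))
        return random.gauss(mean, self.sigma), mean

    def learn(self, x, action, mean, reward):
        # Reinforce the noise direction if reward beat the baseline.
        delta = (reward - self.baseline) * (action - mean) / self.sigma
        for i, xi in enumerate(x):
            self.w[i] += self.lr * delta * xi
        self.baseline += 0.1 * (reward - self.baseline)

# Toy continuous task: learn to output y ≈ 2x from reward alone
# (reward = negative squared error; no target is ever shown directly).
random.seed(0)
unit = SRVUnit(n_inputs=1)
for _ in range(2000):
    x = [random.uniform(-1.0, 1.0)]
    action, mean = unit.act(x)
    unit.learn(x, action, mean, reward=-(action - 2.0 * x[0]) ** 2)
```

In the paper's setting, the input would instead encode the robot's perception of obstacles and goal, the action a motion command, and the reinforcement signal would reflect collision avoidance and progress toward the goal.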
