A Reinforcement Connectionist Approach to Robot Path Finding in Non-Maze-Like Environments

This paper presents a reinforcement connectionist system that discovers and learns suitable situation-action rules for generating feasible paths for a point robot in a 2D environment with circular obstacles. The basic reinforcement algorithm is extended with a strategy for discovering stable solution paths. Equipped with this strategy and a powerful codification scheme, the path-finder (i) learns quickly, (ii) deals with continuous-valued inputs and outputs, (iii) exhibits good noise tolerance and generalization, (iv) copes with dynamic environments, and (v) solves an instance of the path-finding problem under strong performance demands.
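A core ingredient named in the abstract, continuous-valued actions learned from a scalar reinforcement signal, traces back to stochastic real-valued reinforcement units (Gullapalli, 1988). As a rough illustration of that idea only, and not the paper's actual path-finder, codification scheme, or stability strategy, a minimal SRV-style learner on a toy task might look like:

```python
import random

class SRVUnit:
    """Stochastic real-valued (SRV) reinforcement unit in the spirit of
    Gullapalli (1988). The output is drawn from a Gaussian whose mean is a
    linear function of the input; weights are nudged toward perturbations
    that earn above-baseline reinforcement. Names and constants here are
    illustrative, not taken from the paper."""

    def __init__(self, n_inputs, lr=0.05, sigma=0.2):
        self.w = [0.0] * n_inputs   # linear policy weights
        self.lr = lr                # learning rate
        self.sigma = sigma          # fixed exploration noise
        self.baseline = 0.0         # running estimate of expected reward

    def act(self, x):
        mean = sum(wi * xi for wi, xi in zip(self.w, x))
        return random.gauss(mean, self.sigma), mean

    def learn(self, x, action, mean, reward):
        # Reinforce the noise direction if reward beat the baseline.
        delta = (reward - self.baseline) * (action - mean) / self.sigma
        for i, xi in enumerate(x):
            self.w[i] += self.lr * delta * xi
        self.baseline += 0.1 * (reward - self.baseline)

# Toy continuous task: learn to output y ≈ 2x from reward alone
# (reward = negative squared error; no target is ever shown directly).
random.seed(0)
unit = SRVUnit(n_inputs=1)
for _ in range(2000):
    x = [random.uniform(-1.0, 1.0)]
    action, mean = unit.act(x)
    unit.learn(x, action, mean, reward=-(action - 2.0 * x[0]) ** 2)
```

In the paper's setting, the input would instead encode the robot's perception of obstacles and goal, the action a motion command, and the reinforcement signal would reflect collision avoidance and progress toward the goal.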
