Consider a robot wandering around an unfamiliar environment, performing actions and sensing the resulting environmental states. The robot's task is to construct an internal model of its environment, a model that will allow it to predict the consequences of its actions and to determine what sequences of actions to take to reach particular goal states. Rivest and Schapire (1987a, 1987b; Schapire, 1988) have studied this problem and have designed a symbolic algorithm to strategically explore and infer the structure of “finite state” environments. The heart of this algorithm is a clever representation of the environment called an update graph. We have developed a connectionist implementation of the update graph using a highly specialized network architecture. With back propagation learning and a trivial exploration strategy (choosing random actions), the network can outperform the Rivest and Schapire algorithm on simple problems. Perhaps the most interesting consequence of the connectionist approach is that, by relaxing the constraints imposed by a symbolic description, it suggests a more general representation of the update graph, thus allowing for greater flexibility in expressing potential solutions.
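To make the setting concrete, the following is a minimal sketch of a deterministic finite-state environment explored by random actions. It does not implement the update graph or the network described above. The environment (a small rotating bit register), the class name `ShiftRegisterEnv`, the action labels, and the function `explore` are all assumptions chosen for illustration. The sketch only shows the kind of action and sensation trace that such a learner would receive as training data.

```python
import random


class ShiftRegisterEnv:
    """Hypothetical toy finite-state environment: an n-bit register.

    Actions: 'L' rotates the bits left, 'R' rotates them right,
    'F' flips the leftmost bit. The robot senses only the leftmost bit.
    """

    ACTIONS = ("L", "R", "F")

    def __init__(self, n_bits=3):
        self.bits = [0] * n_bits

    def step(self, action):
        """Apply one action and return the resulting sensation (0 or 1)."""
        if action == "L":
            self.bits = self.bits[1:] + self.bits[:1]
        elif action == "R":
            self.bits = self.bits[-1:] + self.bits[:-1]
        elif action == "F":
            self.bits[0] ^= 1
        else:
            raise ValueError(f"unknown action: {action!r}")
        return self.bits[0]


def explore(env, n_steps, seed=0):
    """Trivial exploration strategy: choose actions uniformly at random.

    Returns a list of (action, sensation) pairs, the raw experience from
    which a predictive model of the environment would be learned.
    """
    rng = random.Random(seed)
    trace = []
    for _ in range(n_steps):
        action = rng.choice(env.ACTIONS)
        trace.append((action, env.step(action)))
    return trace


if __name__ == "__main__":
    for action, sensation in explore(ShiftRegisterEnv(3), n_steps=10):
        print(action, sensation)
```

A learner in this setting would be trained to predict each sensation from the preceding actions and sensations. The hidden register contents play the role of the environmental state that the internal model must infer.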
[1] David A. Cohn et al. Training Connectionist Networks with Queries and Selective Sampling. NIPS, 1989.
[2] Ronald L. Rivest et al. Diversity-based inference of finite automata. 28th Annual Symposium on Foundations of Computer Science (sfcs 1987), 1994.
[3] Robert E. Schapire et al. A new approach to unsupervised learning in deterministic environments. 1990.
[4] James L. McClelland et al. Parallel distributed processing: explorations in the microstructure of cognition, vol. 1: foundations. 1986.
[5] Geoffrey E. Hinton et al. Learning internal representations by error propagation. 1986.
[6] James L. McClelland et al. Learning Subsequential Structure in Simple Recurrent Networks. NIPS, 1988.
[7] Jeffrey L. Elman et al. Finding Structure in Time. Cogn. Sci., 1990.
[8] Michael C. Mozer et al. A Focused Backpropagation Algorithm for Temporal Pattern Recognition. Complex Syst., 1989.