An adaptive neural net controller with visual inputs

In 1964, F.W. Smith and B. Widrow demonstrated that a neural network used as an adaptive controller could be trained to stabilize an inverted pendulum fixed to a non-stationary platform or cart (the "broom-balancer"). The critical state variables (pendulum angle, pendulum angular velocity, cart position, cart velocity) were measured and fed to the system in digital form. An unrealized objective in 1964 was to use visual inputs to obtain the state variables indirectly for purposes of control. This study, begun in 1987, has finally achieved that objective.
The present scheme uses two dynamic images (5 by 11 binary pixels each) of the pendulum and its supporting cart. The first image represents the state of the pendulum and cart at the current time, and the second image represents their state a fixed time interval earlier. These images are the inputs to the neural network. A sample 5 by 11 image is shown in Figure 1.
A skilled human trains the neural network while controlling the pendulum. The cart's propulsion is either full speed forward or full speed reverse, as required for stabilization. The human makes these "bang-bang" control decisions, and they are made available to the neural network. The neural network learns to associate the dynamic visual images of the pendulum and cart with the human's control decisions. In effect, the network learns to extract the critical state variables from the visual images and to use that visual information for control.
Two different controllers were tested for this work: a single adaptive linear threshold element (ADALINE) trained with the Widrow-Hoff LMS algorithm, and a two-layer feedforward network with 3 hidden units trained with the backpropagation algorithm. Both controllers were able to stabilize the pendulum. Generalization was demonstrated in that successful stabilization occurred after training on only a limited number of all possible state images.
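The ADALINE variant of this scheme can be illustrated with a minimal sketch. It is not the authors' implementation. The renderer `render`, the +1/-1 pixel coding, the ranges of cart column and lean, the step size `alpha`, and the rule standing in for the human teacher (push in the direction of the lean plus its change since the previous frame) are all assumptions made for illustration. Only the 5 by 11 image size, the use of a current and an earlier image, the bang-bang output, and the Widrow-Hoff LMS update come from the text.

```python
import random

ROWS, COLS = 5, 11                  # image size given in the text
N_INPUTS = 2 * ROWS * COLS + 1      # two images plus one bias input


def render(cart_col, lean):
    """Hypothetical renderer: flat list of +1/-1 pixels, one lit pixel per row."""
    img = [-1.0] * (ROWS * COLS)
    for r in range(ROWS):                       # r = 0 is the top row
        frac = (ROWS - 1 - r) / (ROWS - 1)      # 1 at the rod tip, 0 at the cart
        col = cart_col + int(lean * frac)       # int() truncates toward zero
        img[r * COLS + col] = 1.0
    return img


def build_examples():
    """Enumerate (input vector, teacher decision) pairs for a small state grid."""
    examples = []
    for cart_col in range(3, 8):
        for cart_step in (-1, 0, 1):            # cart motion between the two frames
            for lean_now in range(-2, 3):
                for lean_prev in range(-2, 3):
                    # Assumed stand-in for the human: lean plus its recent change.
                    score = lean_now + (lean_now - lean_prev)
                    if score == 0:
                        continue                # teacher undecided, skip this state
                    x = (render(cart_col, lean_now)
                         + render(cart_col - cart_step, lean_prev)
                         + [1.0])
                    examples.append((x, 1.0 if score > 0 else -1.0))
    return examples


def linear_output(w, x):
    return sum(wi * xi for wi, xi in zip(w, x))


def decide(w, x):
    """Bang-bang decision: +1 (forward) or -1 (reverse)."""
    return 1.0 if linear_output(w, x) >= 0.0 else -1.0


def train_lms(examples, alpha=0.2, epochs=60, seed=0):
    """Widrow-Hoff LMS: correct the weights using the error before thresholding."""
    rng = random.Random(seed)
    w = [0.0] * N_INPUTS
    order = list(examples)
    for _ in range(epochs):
        rng.shuffle(order)
        for x, desired in order:
            err = desired - linear_output(w, x)
            step = alpha * err / N_INPUTS       # |x|^2 == N_INPUTS for +/-1 inputs
            w = [wi + step * xi for wi, xi in zip(w, x)]
    return w


def accuracy(w, examples):
    hits = sum(1 for x, desired in examples if decide(w, x) == desired)
    return hits / len(examples)


examples = build_examples()
train = random.Random(1).sample(examples, k=(6 * len(examples)) // 10)
weights = train_lms(train)
print("states trained on:", len(train), "of", len(examples))
print("agreement with teacher on all states:", accuracy(weights, examples))
```

Training on only a subset of the enumerated states and then scoring agreement on all of them is a loose analogue of the generalization test described above. The toy teacher ignores cart position and velocity, so this sketch says nothing about keeping a real cart within its track.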
This project illustrates the concept of a "trainable expert system." The skill of a human was embedded in the neural network controller. The adaptive system was able to "look over a human's shoulder" and learn to mimic human decisions. This work foreshadows the development of a new kind of machine, the ADAPTIVE COMPUTER, that is trained rather than programmed to perform useful tasks.