Optimal terminal control using feedforward neural networks

Multi-layer sigmoidal neural networks can approximate continuous multivariate functions and can therefore implement nonlinear state-feedback controllers. The most common algorithm for training such controllers is backpropagation through time (BPTT). However, BPTT does not handle terminal control problems in which the cost function includes the elapsed trajectory time. The first half of the thesis addresses this limitation. The controller design is reformulated as an optimization problem defined over the entire field of extremals, with the set of trajectory times incorporated into the cost function. First-order necessary conditions for stationarity are derived; they correspond to standard BPTT with the addition of certain transversality conditions. The new gradient algorithm based on these conditions is called time-optimal backpropagation through time (TOBPTT). This approach is tested on several open final-time problems with very good results. The second half of the thesis proposes an extension to TOBPTT for navigation and obstacle avoidance. This method separates the two tasks of "navigation" and "control". First, an array of interconnected processors rapidly computes a navigational field associated with the current obstacle configuration. The algorithm that accomplishes this is called the cumulative barrier method. It guarantees the absence of local minima and ensures that gradient-induced paths steer clear of obstacle surfaces. Then, a multi-layer network is trained to steer the vehicle along the negative gradient of the resulting field. No retraining is necessary when obstacles are moved; the array simply tracks the obstacle movements and provides the new gradient information to the controller. A variant of the truck-backer problem is used to test this method, with good results.
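
As a rough illustration of the navigation/control split described above, the following Python sketch computes a field on a small occupancy grid and then follows its steepest decrease to the goal. It does not implement the cumulative barrier method, whose details are not given in this abstract; a plain breadth-first wavefront (distance to goal) is swapped in as a stand-in because it also has no local minima on reachable cells. Unlike the method described, it makes no attempt to keep paths away from obstacle surfaces. The names `navigation_field`, `descend`, and the example grid are hypothetical, and the greedy descent only stands in for the trained controller.

```python
from collections import deque

STEPS = ((1, 0), (-1, 0), (0, 1), (0, -1))  # 4-connected grid moves


def navigation_field(grid, goal):
    """Breadth-first distance to goal over free cells (0 = free, 1 = obstacle).

    Stand-in for a navigation field; unreachable and obstacle cells stay None.
    """
    rows, cols = len(grid), len(grid[0])
    field = [[None] * cols for _ in range(rows)]
    field[goal[0]][goal[1]] = 0
    queue = deque([goal])
    while queue:
        r, c = queue.popleft()
        for dr, dc in STEPS:
            nr, nc = r + dr, c + dc
            inside = 0 <= nr < rows and 0 <= nc < cols
            if inside and grid[nr][nc] == 0 and field[nr][nc] is None:
                field[nr][nc] = field[r][c] + 1
                queue.append((nr, nc))
    return field


def descend(field, start):
    """Follow the steepest decrease of the field from start to the goal."""
    rows, cols = len(field), len(field[0])
    r, c = start
    if field[r][c] is None:
        raise ValueError("start cell is blocked or cannot reach the goal")
    path = [start]
    while field[r][c] != 0:
        candidates = [
            (r + dr, c + dc)
            for dr, dc in STEPS
            if 0 <= r + dr < rows and 0 <= c + dc < cols
            and field[r + dr][c + dc] is not None
        ]
        # Every reachable non-goal cell has a neighbour one step closer,
        # so this walk strictly decreases the field value and terminates.
        r, c = min(candidates, key=lambda p: field[p[0]][p[1]])
        path.append((r, c))
    return path


grid = [
    [0, 0, 0, 0, 0],
    [0, 1, 1, 1, 0],
    [0, 0, 0, 1, 0],
    [0, 1, 0, 0, 0],
]
goal, start = (0, 4), (2, 2)

field = navigation_field(grid, goal)
path = descend(field, start)

# Move one obstacle: only the field is recomputed; descend() is unchanged.
moved_grid = [row[:] for row in grid]
moved_grid[1][2] = 0
moved_grid[0][2] = 1
moved_field = navigation_field(moved_grid, goal)
moved_path = descend(moved_field, start)
```

The second half of the example mirrors the claim that no retraining is needed when obstacles move: the field is rebuilt for the new configuration while the descent rule stays fixed.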