The Development of the Time-Delayed Neural Network Architecture

Abstract: Currently, one of the most powerful connectionist learning procedures is back-propagation, which repeatedly adjusts the weights in a network so as to minimize a measure of the difference between the actual output vector of the network and a desired output vector, given the current input vector. The simple weight-adjusting rule is derived by propagating partial derivatives of the error backwards through the net using the chain rule. Experiments have shown that back-propagation has most of the properties desired by connectionists. As with any worthwhile learning rule, it can learn non-linear, black-box functions and make fine distinctions between input patterns in the presence of noise. Moreover, starting from random initial states, back-propagation networks can learn to use their hidden (intermediate-layer) units to represent efficiently the structure inherent in their input data, often discovering intuitively pleasing features. The fact that back-propagation can discover features and distinguish between similar patterns in the presence of noise makes it a natural candidate as a speech recognition method. Another reason for expecting back-propagation to be good at speech is the success that hidden Markov models have enjoyed in speech recognition, which shows that a statistical method can be useful when there is a rigorous, automatic procedure for tuning its parameters.
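As a rough illustration of the procedure described in the abstract, the following is a minimal sketch of back-propagation for a one-hidden-layer network with sigmoid units and a squared-error measure. The layer sizes, learning rate, and toy training pair are illustrative assumptions, not values from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

n_in, n_hidden, n_out = 4, 8, 3      # illustrative layer sizes (assumed)
lr = 0.1                             # illustrative learning rate (assumed)

# Random initial weights (the "random initial states" the abstract mentions).
W1 = rng.normal(scale=0.5, size=(n_in, n_hidden))
W2 = rng.normal(scale=0.5, size=(n_hidden, n_out))

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def train_step(x, target):
    """One weight update on a single (input vector, desired output vector) pair."""
    global W1, W2

    # Forward pass: compute hidden and output activations.
    h = sigmoid(x @ W1)              # hidden (intermediate-layer) units
    y = sigmoid(h @ W2)              # actual output vector

    # Error measure: half the squared difference between actual and desired output.
    error = 0.5 * np.sum((y - target) ** 2)

    # Backward pass: propagate partial derivatives of the error
    # through the net using the chain rule.
    delta_out = (y - target) * y * (1.0 - y)        # dE/d(net input) at output units
    delta_hid = (delta_out @ W2.T) * h * (1.0 - h)  # dE/d(net input) at hidden units

    # Weight-adjusting rule: move each weight down its error gradient.
    W2 -= lr * np.outer(h, delta_out)
    W1 -= lr * np.outer(x, delta_hid)
    return error

# Usage: repeatedly adjust the weights on a toy input/target pair.
x = np.array([0.0, 1.0, 0.5, -0.5])
t = np.array([1.0, 0.0, 0.0])
for step in range(1000):
    e = train_step(x, t)
print(f"final squared error: {e:.6f}")
```

The repeated calls to `train_step` correspond to the abstract's "repeatedly adjusts the weights ... so as to minimize a measure of the difference" between the actual and desired output vectors; the two `delta` terms are the partial derivatives propagated backwards via the chain rule.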