Paper information - SPEECH RECOGNITION FOR THE TIMIT DATABASE USING NEURAL NETWORKS

The authors previously established four types of neural network for speech recognition and tested them on a small database of 7 speakers and 100 sentences. This paper describes their application to the TIMIT database. The networks are: a recurrent network phoneme recogniser, a modified Kanerva model morph recogniser, a compositional representation phoneme-to-word recogniser, and a modified Kanerva model morph-to-word recogniser. The major result is for the recurrent net, which gives a phoneme recognition accuracy of 57% on the si and sx sentences. The Kanerva morph recogniser achieves 66.2% accuracy on a small subset of the sa and sx sentences. The results for the word recognisers are incomplete.
