On-line recognition of spoken words from a large vocabulary

Abstract This paper demonstrates that a real-time, large-vocabulary, isolated-word speech recognition system can be implemented effectively using a two-stage organization: (1) conversion of the speech signal into phonemic transcriptions, and (2) recognition of the phonemic transcriptions by advanced searching methods. A comparison of several alternatives for the first stage indicated that the best accuracy is achieved by the learning-subspace method. For the second stage, the authors recommend fast string searching by redundant hash addressing, combined with subsequent probabilistic analysis. The system has been implemented in a minicomputer environment.
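
As an illustration of the second stage only, the following is a minimal sketch of the general idea behind redundant hash addressing: every vocabulary word is indexed under several overlapping local features, so that an erroneous transcription still reaches the correct entry through whichever features survive. It is not the authors' implementation. The bigram features, the `#` boundary marker, the function names, and the shortlist size are assumptions made for this example, and a plain edit distance stands in for the probabilistic analysis, which the abstract does not specify.

```python
from collections import Counter, defaultdict


def ngrams(s, n=2):
    """Return the set of overlapping n-grams of s, padded with '#' boundary markers."""
    padded = "#" + s + "#"
    return {padded[i:i + n] for i in range(len(padded) - n + 1)}


def build_index(vocabulary, n=2):
    """Store each word redundantly, once under every n-gram it contains."""
    index = defaultdict(set)
    for word in vocabulary:
        for g in ngrams(word, n):
            index[g].add(word)
    return index


def edit_distance(a, b):
    """Levenshtein distance; an assumed stand-in for the probabilistic analysis."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                # deletion
                           cur[j - 1] + 1,             # insertion
                           prev[j - 1] + (ca != cb)))  # substitution or match
        prev = cur
    return prev[-1]


def recognize(transcription, index, n=2, shortlist=5):
    """Vote for words sharing n-grams with the input, then rescore the shortlist."""
    votes = Counter()
    for g in ngrams(transcription, n):
        for word in index.get(g, ()):
            votes[word] += 1
    if not votes:
        return None  # no feature of the input matches any vocabulary word
    candidates = [w for w, _ in votes.most_common(shortlist)]
    # Break ties on distance alphabetically so the result is deterministic.
    return min(candidates, key=lambda w: (edit_distance(transcription, w), w))


if __name__ == "__main__":
    index = build_index(["helsinki", "helsingin", "tampere", "turku"])
    print(recognize("helsinky", index))
```

In this sketch the voting step is a cheap filter whose cost depends on the number of features in the input rather than on the vocabulary size, and the more expensive comparison is applied only to the short candidate list.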
