Speech Recognition by Composition of Weighted Finite Automata

We present a general framework based on weighted finite automata and weighted finite-state transducers for describing and implementing speech recognizers. The framework allows us to represent uniformly the information sources and data structures used in recognition, including context-dependent units, pronunciation dictionaries, language models and lattices. Furthermore, general but efficient algorithms can be used to combine information sources in actual recognizers and to optimize their application. In particular, a single composition algorithm is used both to combine information sources such as language models and dictionaries in advance, and to combine acoustic observations with those information sources dynamically during recognition.
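To make the role of composition concrete, the following is a minimal sketch of weighted transducer composition. It is not the implementation described in this paper. It assumes the tropical semiring (weights are costs that add along a path, and the cheapest path wins), epsilon-free transducers, and non-negative weights. The names `WFST`, `compose` and `best_weight`, and the toy symbols and weights, are hypothetical and chosen only for illustration.

```python
import heapq
from collections import defaultdict, deque


class WFST:
    """Toy weighted transducer (hypothetical helper class, tropical semiring)."""

    def __init__(self, start, finals, arcs):
        self.start = start
        self.finals = dict(finals)        # state -> final cost
        self.arcs = defaultdict(list)     # state -> [(in, out, cost, next)]
        for src, ilabel, olabel, cost, dst in arcs:
            self.arcs[src].append((ilabel, olabel, cost, dst))


def compose(a, b):
    """Compose epsilon-free transducers a and b.

    States of the result are pairs (state of a, state of b). Only pairs
    reachable from the start pair are built. An arc is created when the
    output label of an arc in a equals the input label of an arc in b,
    and its cost is the sum of the two costs.
    """
    start = (a.start, b.start)
    arcs, finals = [], {}
    queue, seen = deque([start]), {start}
    while queue:
        qa, qb = queue.popleft()
        if qa in a.finals and qb in b.finals:
            finals[(qa, qb)] = a.finals[qa] + b.finals[qb]
        for ilabel, mid, w1, na in a.arcs.get(qa, []):
            for mid2, olabel, w2, nb in b.arcs.get(qb, []):
                if mid == mid2:
                    nxt = (na, nb)
                    arcs.append(((qa, qb), ilabel, olabel, w1 + w2, nxt))
                    if nxt not in seen:
                        seen.add(nxt)
                        queue.append(nxt)
    return WFST(start, finals, arcs)


def best_weight(t):
    """Cost of the cheapest accepting path (Dijkstra, non-negative costs)."""
    heap, done, tie = [(0.0, 0, t.start)], set(), 1
    best = float("inf")
    while heap:
        dist, _, q = heapq.heappop(heap)
        if q in done:
            continue
        done.add(q)
        if q in t.finals:
            best = min(best, dist + t.finals[q])
        for _ilabel, _olabel, cost, nxt in t.arcs.get(q, []):
            if nxt not in done:
                heapq.heappush(heap, (dist + cost, tie, nxt))
                tie += 1
    return best


# Toy example: A rewrites "a" as "x" or "y"; B scores "x" and "y".
A = WFST(0, {1: 0.0}, [(0, "a", "x", 1.0, 1), (0, "a", "y", 2.0, 1)])
B = WFST(0, {1: 0.0}, [(0, "x", "x", 0.5, 1), (0, "y", "y", 0.25, 1)])
AB = compose(A, B)
```

In this sketch the same `compose` function could be applied offline, for example to a dictionary and a language model, or at recognition time to a transducer built from acoustic observations, which is the uniformity the abstract points to. A practical recognizer would also need epsilon handling and lazy (on-demand) state expansion, both of which are omitted here.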
