Finite-State Transducers in Language and Speech Processing

Finite-state machines have been used in various domains of natural language processing. We consider here the use of a type of transducer that supports very efficient programs: sequential transducers. We recall classical theorems and give new ones characterizing sequential string-to-string transducers. Transducers that output weights also play an important role in language and speech processing. We give a specific study of string-to-weight transducers, including algorithms for determinizing and minimizing these transducers very efficiently, and characterizations of the transducers admitting determinization and the corresponding algorithms. Some applications of these algorithms in speech recognition are described and illustrated.
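
As an illustration of why sequential transducers support efficient programs, the following is a minimal sketch of a sequential string-to-weight transducer. It assumes weights are combined by addition along a path, as in the (min, +) setting commonly used in speech recognition. The class name `SequentialWeightTransducer`, its fields, and the toy machine are hypothetical and chosen for illustration only; this is not the paper's determinization or minimization algorithm.

```python
from dataclasses import dataclass


@dataclass
class SequentialWeightTransducer:
    """Toy sequential (input-deterministic) string-to-weight transducer.

    Assumption: weights are summed along the unique path, then the final
    weight of the last state is added. All names here are illustrative.
    """
    initial: int
    initial_weight: float
    transitions: dict    # (state, symbol) -> (next_state, weight)
    final_weights: dict  # state -> weight added when the input ends there

    def weight(self, text):
        """Return the weight assigned to text, or None if it is rejected."""
        state = self.initial
        total = self.initial_weight
        for symbol in text:
            step = self.transitions.get((state, symbol))
            if step is None:
                return None  # no transition: the input is not accepted
            state, w = step
            total += w
        if state not in self.final_weights:
            return None  # ended in a non-final state
        return total + self.final_weights[state]


# Hypothetical three-state machine over the alphabet {a, b, c}.
toy = SequentialWeightTransducer(
    initial=0,
    initial_weight=0.0,
    transitions={
        (0, "a"): (1, 1.0),
        (1, "b"): (1, 0.5),
        (1, "c"): (2, 2.0),
    },
    final_weights={1: 3.0, 2: 0.0},
)

print(toy.weight("abc"))  # 1.0 + 0.5 + 2.0 + 0.0
print(toy.weight("b"))    # rejected: no transition on "b" from state 0
```

Because each (state, symbol) pair has at most one outgoing transition, the lookup follows a single path and its cost grows linearly with the input length, independent of the size of the machine. A nondeterministic transducer would in general have to track several paths at once, which is the cost that determinization removes when it applies.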
