Parsing N-best lists of handwritten sentences

This paper investigates applying a probabilistic natural-language parser to the list of the N-best sentences produced by an offline recognition system for cursive handwritten sentences. The N-best sentence list is generated by an HMM-based recognizer that includes a bigram language model. The sentences are parsed by a bottom-up chart parser for stochastic context-free grammars, which produces the parse tree of the input sentence as well as the word tags. The linguistic resources needed to build the lexicon, the word bigram model, and the stochastic context-free grammar are extracted from a collection of corpora. Experimental results indicate that the proposed combination scheme increases both the word and the sentence recognition rate.
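
The idea of rescoring an N-best list with a stochastic context-free grammar can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the toy grammar, the CYK-style Viterbi chart parser (one common bottom-up chart technique, not necessarily the parser used in the paper), the linear combination of log scores, and the names `viterbi_parse_logprob`, `rerank`, `weight`, and `no_parse_penalty`. The abstract does not specify the actual combination scheme.

```python
import math
from collections import defaultdict

# Hypothetical toy SCFG in Chomsky normal form (illustration only).
# Binary rules: (parent, left child, right child) -> probability
BINARY = {
    ("S", "NP", "VP"): 1.0,
    ("NP", "DT", "NN"): 1.0,
    ("VP", "VB", "NP"): 1.0,
}
# Lexical rules: (tag, word) -> probability
LEXICAL = {
    ("DT", "the"): 1.0,
    ("NN", "dog"): 0.5,
    ("NN", "cat"): 0.5,
    ("VB", "sees"): 1.0,
}


def viterbi_parse_logprob(words, start="S"):
    """Log-probability of the best parse of `words`, or -inf if none exists."""
    n = len(words)
    if n == 0:
        return -math.inf
    # chart[(i, j)][symbol] = best log-probability of symbol spanning words[i:j]
    chart = defaultdict(dict)
    # Fill the diagonal with word tags.
    for i, w in enumerate(words):
        for (tag, word), p in LEXICAL.items():
            if word == w:
                cell = chart[(i, i + 1)]
                cell[tag] = max(cell.get(tag, -math.inf), math.log(p))
    # Build longer spans bottom-up from shorter ones.
    for span in range(2, n + 1):
        for i in range(0, n - span + 1):
            j = i + span
            cell = chart[(i, j)]
            for k in range(i + 1, j):
                left = chart.get((i, k), {})
                right = chart.get((k, j), {})
                for (parent, lsym, rsym), p in BINARY.items():
                    if lsym in left and rsym in right:
                        score = math.log(p) + left[lsym] + right[rsym]
                        if score > cell.get(parent, -math.inf):
                            cell[parent] = score
    return chart[(0, n)].get(start, -math.inf)


def rerank(nbest, weight=1.0, no_parse_penalty=-50.0):
    """Reorder (sentence, recognizer_log_score) pairs, best first.

    The combined score is recognizer score + weight * parse log-probability;
    unparsable sentences receive a fixed penalty (an assumed design choice).
    """
    scored = []
    for sentence, rec_score in nbest:
        parse_lp = viterbi_parse_logprob(sentence.split())
        if parse_lp == -math.inf:
            parse_lp = no_parse_penalty
        scored.append((rec_score + weight * parse_lp, sentence))
    return sorted(scored, reverse=True)


# Usage: the recognizer slightly prefers the misrecognized sentence,
# but only the second hypothesis is covered by the grammar.
nbest = [("the dog seas the cat", -10.0), ("the dog sees the cat", -10.5)]
ranked = rerank(nbest)
```

In this sketch the grammar acts as a second opinion on each hypothesis: a sentence that the recognizer scores slightly lower can still move to the top of the list when it is the only one with a valid parse. In a real system the weight and the penalty would have to be tuned on held-out data.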
