OFF-LINE HANDWRITING RECOGNITION USING VARIOUS HYBRID MODELING TECHNIQUES AND CHARACTER N-GRAMS

In this paper, a system for off-line cursive handwriting recognition is described. The system is based on Hidden Markov Models (HMMs) using discrete and hybrid modeling techniques. We focus on two aspects of the recognition system. First, we present two hybrid modeling techniques: one uses a neural network trained with an information-theoretic criterion (the MMI criterion) as a vector quantizer, and the other uses a neural network to estimate the a posteriori probabilities that replace the codebook of a tied-mixture HMM system. This is the first paper in which we present this novel approach, called tied posteriors, for handwriting recognition. Second, we demonstrate the use of a language model consisting of character n-grams as an alternative to recognition with a large dictionary of German words. The resulting character recognition system yields significantly better recognition results with an unlimited vocabulary.
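To illustrate the idea of a character n-gram language model, the following is a minimal sketch of a character bigram model with add-one smoothing. It is not the language model used in the paper: the function names, the boundary symbols `^` and `$`, the smoothing method, and the toy training words are all assumptions chosen for illustration.

```python
import math
from collections import Counter


def train_char_bigrams(words, bos="^", eos="$"):
    """Count character bigrams over a word list (hypothetical helper).

    bos/eos are assumed boundary symbols that must not occur in the words.
    """
    context_counts = Counter()  # how often a character starts a bigram
    bigram_counts = Counter()   # how often the pair (a, b) occurs
    alphabet = set()
    for word in words:
        seq = bos + word + eos
        alphabet.update(seq)
        for a, b in zip(seq, seq[1:]):
            context_counts[a] += 1
            bigram_counts[a, b] += 1
    return context_counts, bigram_counts, alphabet


def log_prob(word, model, bos="^", eos="$"):
    """Log probability of a character string under the bigram model.

    Add-one smoothing (an assumption of this sketch) keeps unseen
    character pairs from receiving zero probability, so any string
    over the alphabet can be scored without a dictionary.
    """
    context_counts, bigram_counts, alphabet = model
    vocab_size = len(alphabet)  # includes the boundary symbols
    seq = bos + word + eos
    total = 0.0
    for a, b in zip(seq, seq[1:]):
        p = (bigram_counts[a, b] + 1) / (context_counts[a] + vocab_size)
        total += math.log(p)
    return total


# Toy training data, purely illustrative.
model = train_char_bigrams(["haus", "maus", "hund", "hand"])
score_seen = log_prob("haus", model)
score_scrambled = log_prob("hsua", model)
score_unseen = log_prob("hans", model)
```

In a recognizer, such a score would be combined with the HMM character scores during decoding, so that plausible character sequences are preferred even when the word is not in any dictionary. Here, `score_seen` is higher than `score_scrambled`, and the out-of-vocabulary string `hans` still receives a finite score.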
