An architecture for handwritten text recognition systems

Abstract. This paper presents an end-to-end system for reading handwritten page images. The system comprises five functional modules: (i) pre-processing, which introduces an image representation that makes large page images easy to manipulate, together with image handling procedures built on that representation; (ii) line separation, which detects text lines and extracts images of individual lines from a page image; (iii) word segmentation, which locates word gaps and isolates words from a text line image efficiently and intelligently; (iv) word recognition, which covers the handwritten word recognition algorithms; and (v) linguistic post-processing, which uses linguistic constraints to parse and recognize text. The paper describes the key ideas employed in each module, which were developed to cope with the diversity of handwriting while keeping the system reliable and robust. Preliminary experiments show promising results in terms of speed and accuracy.
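To make the five-module data flow concrete, the following is a minimal Python sketch of how such a pipeline could be wired together. It is not the system described in this paper: the projection-profile line separation, the fixed gap threshold for word segmentation, and the `difflib`-based lexicon matching are simple stand-ins chosen for illustration, and every name in it (`preprocess`, `separate_lines`, `segment_words`, `postprocess`, `read_page`, the `recognize` callback) is hypothetical.

```python
from difflib import get_close_matches
from typing import Callable, List, Sequence, Tuple

Image = List[List[int]]  # binary image: 1 = ink, 0 = background (assumed format)


def preprocess(page: Image) -> Image:
    """Stage (i), placeholder: normalize every pixel to 0 or 1."""
    return [[1 if px else 0 for px in row] for row in page]


def _runs(profile: Sequence[int], min_gap: int) -> List[Tuple[int, int]]:
    """Return (start, end) ranges of non-zero entries in a projection profile,
    merging runs separated by fewer than min_gap zero entries."""
    runs: List[Tuple[int, int]] = []
    start = None
    for i, value in enumerate(profile):
        if value and start is None:
            start = i
        elif not value and start is not None:
            runs.append((start, i))
            start = None
    if start is not None:
        runs.append((start, len(profile)))
    merged: List[Tuple[int, int]] = []
    for s, e in runs:
        if merged and s - merged[-1][1] < min_gap:
            merged[-1] = (merged[-1][0], e)
        else:
            merged.append((s, e))
    return merged


def separate_lines(page: Image) -> List[Image]:
    """Stage (ii), stand-in: split on empty rows of the horizontal profile."""
    profile = [sum(row) for row in page]
    return [page[s:e] for s, e in _runs(profile, min_gap=1)]


def segment_words(line: Image, min_gap: int = 2) -> List[Image]:
    """Stage (iii), stand-in: split where the vertical profile has a gap of
    at least min_gap empty columns (an assumed fixed threshold)."""
    profile = [sum(col) for col in zip(*line)]
    return [[row[s:e] for row in line] for s, e in _runs(profile, min_gap)]


def postprocess(words: List[str], lexicon: List[str]) -> List[str]:
    """Stage (v), stand-in: snap each hypothesis to its closest lexicon entry,
    keeping the raw hypothesis when nothing is similar enough."""
    result = []
    for word in words:
        match = get_close_matches(word, lexicon, n=1, cutoff=0.6)
        result.append(match[0] if match else word)
    return result


def read_page(page: Image,
              recognize: Callable[[Image], str],
              lexicon: List[str]) -> List[List[str]]:
    """Run all five stages; stage (iv) is the caller-supplied recognizer."""
    text = []
    for line in separate_lines(preprocess(page)):
        raw = [recognize(word) for word in segment_words(line)]
        text.append(postprocess(raw, lexicon))
    return text
```

Passing the recognizer in as a callback keeps the sketch independent of any particular word recognition algorithm; each stage consumes only the output of the previous one, which mirrors the modular decomposition listed in the abstract.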
