Preprocessing and feature extraction for a handwriting recognition system

Offline cursive script word recognition has received increasing attention in recent years. Impressive progress has been made in reading isolated single characters over the last decade, but cursive script recognition still lacks a good recognition rate. Because unconstrained handwritten script words are highly variable, the domain is much more difficult than single character recognition. To achieve acceptable results, the context has to be restricted by a given lexicon of all possible words. The only accessible information is the binary image of the cursive script word. Since handling raster data is cumbersome, connectivity analysis is applied as a first processing step. It is then necessary to reduce the variability as much as possible without losing relevant information. Several normalization steps are therefore applied, covering angle, rotation, stroke width, and size. The normalization techniques of the authors' system and the subsequent feature extraction are presented. The proposed algorithms are very efficient because they are based on the contour information provided by connectivity analysis.
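
The abstract does not specify how connectivity analysis is carried out. As a rough illustration of the idea only, the sketch below labels 8-connected foreground regions of a binary image with a breadth-first flood fill and records a bounding box per region. The function name `label_components`, the list-of-lists image format, and the choice of 8-connectivity are all assumptions made for this example; the authors' system works on contour information, which this sketch does not reproduce.

```python
from collections import deque


def label_components(image):
    """Label 8-connected foreground regions of a binary image.

    `image` is a list of rows containing 0 (background) or 1 (foreground).
    This is a hypothetical helper for illustration, not the paper's method.
    Returns (labels, boxes): a label grid and a bounding box per label,
    stored as (min_row, min_col, max_row, max_col).
    """
    rows = len(image)
    cols = len(image[0]) if rows else 0
    labels = [[0] * cols for _ in range(rows)]
    boxes = {}
    next_label = 0
    for r in range(rows):
        for c in range(cols):
            # Skip background pixels and pixels that already have a label.
            if image[r][c] != 1 or labels[r][c] != 0:
                continue
            next_label += 1
            labels[r][c] = next_label
            queue = deque([(r, c)])
            r0, c0, r1, c1 = r, c, r, c
            while queue:
                y, x = queue.popleft()
                # Grow the bounding box to include the current pixel.
                r0, c0 = min(r0, y), min(c0, x)
                r1, c1 = max(r1, y), max(c1, x)
                # Visit all eight neighbours (assumed connectivity).
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = y + dy, x + dx
                        if (0 <= ny < rows and 0 <= nx < cols
                                and image[ny][nx] == 1
                                and labels[ny][nx] == 0):
                            labels[ny][nx] = next_label
                            queue.append((ny, nx))
            boxes[next_label] = (r0, c0, r1, c1)
    return labels, boxes
```

Bounding boxes of this kind are one possible starting point for a size normalization step, since they give the extent of each connected region without rescanning the raster.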
