Generation of synthetic training data for an HMM-based handwriting recognition system

We present a perturbation model that generates synthetic text lines from existing lines of cursive handwriting produced by human writers. The aim is to improve the performance of an HMM-based off-line cursive handwriting recognition system by providing it with additional synthetic training data. Two kinds of perturbation are applied: geometrical transformations and thinning/thickening operations. The proposed perturbation model is evaluated under different experimental conditions.
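
To make the two kinds of perturbation concrete, the following is a minimal sketch and not the model used in the paper, whose transformations and parameter ranges are not given in this abstract. It assumes a binary text-line image stored as a list of rows (1 = ink), uses a horizontal shear as a stand-in for the geometrical transformations, and uses a 3x3 dilation or erosion as a stand-in for thickening and thinning. The names `shear`, `morph`, `perturb`, and `max_slant` are hypothetical.

```python
import random

def shear(img, slant):
    """Shift each row sideways in proportion to its height above the bottom row."""
    h, w = len(img), len(img[0])
    pad = int(abs(slant) * (h - 1)) + 1          # room for the largest shift
    out = [[0] * (w + 2 * pad) for _ in range(h)]
    for y in range(h):
        dx = round(slant * (h - 1 - y))
        for x in range(w):
            if img[y][x]:
                out[y][x + pad + dx] = 1
    return out

def morph(img, thicken):
    """3x3 dilation (thicken=True) or erosion (thicken=False).

    The window is clipped at the image border rather than padded.
    """
    h, w = len(img), len(img[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            window = [img[j][i]
                      for j in range(max(0, y - 1), min(h, y + 2))
                      for i in range(max(0, x - 1), min(w, x + 2))]
            out[y][x] = max(window) if thicken else min(window)
    return out

def perturb(img, rng, max_slant=0.3):
    """Apply a random shear, then optionally thin or thicken (assumed ranges)."""
    out = shear(img, rng.uniform(-max_slant, max_slant))
    choice = rng.choice(["thin", "thicken", "none"])
    if choice != "none":
        out = morph(out, thicken=(choice == "thicken"))
    return out

if __name__ == "__main__":
    stroke = [[0, 0, 1, 0, 0] for _ in range(5)]   # a one-pixel vertical stroke
    for row in perturb(stroke, random.Random(0)):
        print("".join("#" if p else "." for p in row))
```

Each call to `perturb` with a different random state yields a different synthetic variant of the same line. Note that plain erosion can erase strokes that are only one or two pixels wide, so a practical thinning step would need to preserve stroke connectivity.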
