Text recognition of low-resolution document images

Cheap and versatile cameras make it possible to capture a wide variety of documents easily and quickly. However, low-resolution cameras present a challenge to OCR because it is virtually impossible to segment characters independently of recognition. In this paper we solve the segmentation and recognition problems simultaneously by applying methods borrowed from cursive handwriting recognition. To achieve maximum robustness, we use a machine learning approach based on a convolutional neural network. When our system is combined with a language model using dynamic programming, the overall performance is in the vicinity of 80-95% word accuracy on pages captured with a 1024×768 webcam and 10-point text.
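As a rough illustration of how dynamic programming can couple segmentation with recognition, the following minimal sketch scores each lexicon word against a sequence of per-position character scores and keeps the best one. It is not the system described in the paper: the input format (`log_probs`, one dictionary of character log-probabilities per horizontal window position), the `floor` penalty, and the function names `align_score` and `recognize_word` are all assumptions made for this example, and a plain word list stands in for the language model.

```python
import math


def align_score(word, log_probs):
    """Best log-score of aligning `word` to a sequence of window positions.

    log_probs[t] is an assumed dict mapping a character to its
    log-probability at window position t. Characters must appear in order
    and each must cover at least one position, so the character boundaries
    are chosen jointly with the character identities.
    """
    n_frames, n_chars = len(log_probs), len(word)
    if n_chars == 0 or n_chars > n_frames:
        return -math.inf
    # Assumed penalty for characters the recognizer did not score at all.
    floor = math.log(1e-6)
    # best[j]: best score so far with the current position assigned to word[j].
    best = [-math.inf] * n_chars
    best[0] = log_probs[0].get(word[0], floor)
    for t in range(1, n_frames):
        new = [-math.inf] * n_chars
        for j in range(n_chars):
            stay = best[j]                                   # same character continues
            advance = best[j - 1] if j > 0 else -math.inf    # next character starts
            new[j] = max(stay, advance) + log_probs[t].get(word[j], floor)
        best = new
    # The last position must belong to the last character of the word.
    return best[-1]


def recognize_word(log_probs, lexicon):
    """Return the lexicon word with the highest alignment score."""
    return max(lexicon, key=lambda w: align_score(w, log_probs))
```

Restricting the search to lexicon words is what lets weak per-position evidence be resolved by context; a real system would likely replace the word list with a richer language model and a more careful treatment of character widths.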
