Paper information - Best practices for convolutional neural networks applied to visual document analysis

Best practices for convolutional neural networks applied to visual document analysis

Neural networks are a powerful technology for classification of visual inputs arising from documents. However, there is a confusing plethora of different neural network methods that are used in the literature and in industry. This paper describes a set of concrete best practices that document analysis researchers can use to get good results with neural networks. The most important practice is getting a training set as large as possible: we expand the training set by adding a new form of distorted data. The next most important practice is that convolutional neural networks are better suited for visual document tasks than fully connected networks. We propose that a simple "do-it-yourself" implementation of convolution with a flexible architecture is suitable for many visual document problems. This simple convolutional neural network does not require complex methods, such as momentum, weight decay, structure-dependent learning rates, averaging layers, tangent prop, or even finely-tuning the architecture. The end result is a very simple yet general architecture which can yield state-of-the-art performance for document analysis. We illustrate our claims on the MNIST set of English digit images.
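The abstract only says that the training set is expanded with "a new form of distorted data". As a rough illustration, the sketch below assumes an elastic-style distortion: a random displacement field is smoothed with a Gaussian, scaled, and used to resample each image with bilinear interpolation. The function names (`elastic_distort`, `expand_training_set`) and the parameters `alpha`, `sigma`, and `copies` are hypothetical choices for this example, not values or code taken from the paper.

```python
import math
import random


def gaussian_kernel(sigma):
    """Normalized 1-D Gaussian kernel, truncated at about 3 * sigma."""
    radius = max(1, int(3 * sigma))
    weights = [math.exp(-(i * i) / (2.0 * sigma * sigma))
               for i in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def smooth(field, kernel):
    """Separable Gaussian blur of a 2-D list; samples outside the grid count as 0."""
    h, w = len(field), len(field[0])
    r = len(kernel) // 2
    # Horizontal pass.
    rows = [[sum(kernel[k + r] * field[y][x + k]
                 for k in range(-r, r + 1) if 0 <= x + k < w)
             for x in range(w)] for y in range(h)]
    # Vertical pass.
    return [[sum(kernel[k + r] * rows[y + k][x]
                 for k in range(-r, r + 1) if 0 <= y + k < h)
             for x in range(w)] for y in range(h)]


def bilinear(image, y, x):
    """Sample image at real-valued (y, x); pixels outside the image are background (0)."""
    h, w = len(image), len(image[0])
    y0, x0 = math.floor(y), math.floor(x)
    fy, fx = y - y0, x - x0

    def px(r, c):
        return image[r][c] if 0 <= r < h and 0 <= c < w else 0.0

    top = (1 - fx) * px(y0, x0) + fx * px(y0, x0 + 1)
    bottom = (1 - fx) * px(y0 + 1, x0) + fx * px(y0 + 1, x0 + 1)
    return (1 - fy) * top + fy * bottom


def elastic_distort(image, alpha, sigma, rng):
    """Return a distorted copy of image (a 2-D list of pixel intensities).

    alpha scales the displacement; sigma controls how smooth the field is.
    Both are illustrative parameters that would need tuning per dataset.
    """
    h, w = len(image), len(image[0])
    kernel = gaussian_kernel(sigma)
    # Independent random displacements in [-1, 1] per pixel, then smoothed.
    dx = smooth([[rng.uniform(-1, 1) for _ in range(w)] for _ in range(h)], kernel)
    dy = smooth([[rng.uniform(-1, 1) for _ in range(w)] for _ in range(h)], kernel)
    return [[bilinear(image, y + alpha * dy[y][x], x + alpha * dx[y][x])
             for x in range(w)] for y in range(h)]


def expand_training_set(samples, copies, alpha, sigma, seed=0):
    """Keep every (image, label) pair and append `copies` distorted variants of each."""
    rng = random.Random(seed)
    expanded = list(samples)
    for image, label in samples:
        for _ in range(copies):
            expanded.append((elastic_distort(image, alpha, sigma, rng), label))
    return expanded
```

Each distorted copy keeps the label of the image it came from, so the expanded set can be fed to the same training loop as the original data. This pure-Python version favors readability over speed; an array library would be the practical choice for a dataset the size of MNIST.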

Patrice Y. Simard | David Steinkraus | John C. Platt

[1] Kurt Hornik, et al. Universal approximation of an unknown mapping and its derivatives using multilayer feedforward networks, 1990, Neural Networks.

[2] Heekuck Oh, et al. Neural Networks for Pattern Recognition, 1993, Adv. Comput.

[3] Richard F. Lyon, et al. Effective Training of a Neural Network Character Classifier for Word Recognition, 1996, NIPS.

[4] Yoshua Bengio, et al. Gradient-based learning applied to document recognition, 1998, Proc. IEEE.

[5] Anshu Sinha, et al. An improved recognition module for the identification of handwritten digits, 1999.

[6] Yong Haur Tay, et al. An offline cursive handwritten word recognition system, 2001, Proceedings of IEEE Region 10 International Conference on Electrical and Electronic Technology. TENCON 2001 (Cat. No.01CH37239).

[7] Michele Banko, et al. Mitigating the Paucity-of-Data Problem: Exploring the Effect of Training Corpus Size on Classifier Performance for Natural Language Processing, 2001, HLT.

[8] Proceedings Seventh International Conference on Document Analysis and Recognition, 2003, Seventh International Conference on Document Analysis and Recognition, 2003. Proceedings.

[9] Bernhard Schölkopf, et al. Training Invariant Support Vector Machines, 2002, Machine Learning.

[10] Yann LeCun, et al. The MNIST database of handwritten digits, 2005.