Linear-quadratic blind source separating structure for removing show-through in scanned documents

Scanned documents are often degraded by content from the reverse side of the page. This show-through effect arises when the back-side image interferes with the front-side image because of the intrinsic transparency of the paper. It is a degradation one would particularly like to remove in Optical Character Recognition (OCR) and document digitization, both of which require denoised text as input. In this paper, we first propose a novel and general nonlinear model for canceling the show-through phenomenon. For this purpose, we use a nonlinear blind source separation algorithm based on a new recursive and extensible structure. However, the results are limited by a blurring effect that appears during scanning and is caused by the light transfer function of the paper. Consequently, to improve the results, we introduce a refined separating architecture that removes the show-through and blurring effects simultaneously.
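
As a rough illustration of the idea named in the title, the sketch below mixes two images with a linear-quadratic model and recovers them with a recurrent (fixed-point) structure. It is a minimal sketch under stated assumptions, not the algorithm of the paper: the exact form of the mixing model, its sign convention, the function names, and the coefficient values are all assumed here, and the coefficients are taken as known, whereas a blind method would have to estimate them from the scans. Blurring is not modeled.

```python
import numpy as np


def mix_linear_quadratic(s1, s2, a12, a21, b1, b2):
    """Assumed linear-quadratic mixture of two source images.

    s1, s2 stand for the front-side and back-side images. The cross term
    s1 * s2 is the quadratic part; all coefficients are illustrative.
    """
    x1 = s1 + a12 * s2 + b1 * s1 * s2
    x2 = a21 * s1 + s2 + b2 * s1 * s2
    return x1, x2


def separate_recurrent(x1, x2, a12, a21, b1, b2, n_iter=50):
    """Recurrent structure: iterate the outputs until they stop changing.

    Each pass subtracts the estimated interference from the observations.
    The sources are a fixed point of this update; the iteration converges
    to them only when the coefficients are small enough for the update to
    be a contraction (true for the values used in the usage note).
    """
    y1, y2 = x1.copy(), x2.copy()
    for _ in range(n_iter):
        # Both outputs are updated simultaneously from the previous pass.
        y1, y2 = (
            x1 - a12 * y2 - b1 * y1 * y2,
            x2 - a21 * y1 - b2 * y1 * y2,
        )
    return y1, y2


def demo(seed=0):
    """Mix two random 8x8 'images' and return the largest recovery error."""
    rng = np.random.default_rng(seed)
    s1 = rng.uniform(0.0, 1.0, size=(8, 8))
    s2 = rng.uniform(0.0, 1.0, size=(8, 8))
    coeffs = dict(a12=0.3, a21=0.2, b1=0.1, b2=0.1)  # assumed values
    x1, x2 = mix_linear_quadratic(s1, s2, **coeffs)
    y1, y2 = separate_recurrent(x1, x2, **coeffs)
    return max(np.abs(y1 - s1).max(), np.abs(y2 - s2).max())
```

Calling `demo()` returns the largest absolute difference between the recovered and original images. With the small coefficients assumed above the recurrence contracts toward the sources; larger coefficients can make it diverge, which is one reason the stability of such structures needs separate study.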
