Reconstructing Cross-Cut Shredded Text Documents: A Genetic Algorithm with Splicing-Driven Reproduction

This work addresses the reconstruction of cross-cut shredded text documents (RCCSTD), a problem of high interest in forensics and archeology. We propose a novel genetic algorithm that combines a splicing-driven crossover, four mutation operators, and a row-oriented elitism strategy to improve search in the large, complex solution space of RCCSTD. We also design a comprehensive objective function, based on both edge-based and empty-vector-based splicing error, that is intended to guarantee that the correct reconstruction always has the lowest cost value. Experiments on six RCCSTD scenarios show that the proposed algorithm significantly outperforms the previous best-known algorithms for this problem.
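
To make the idea of an edge-based splicing error concrete, the following is a minimal sketch, not the objective function of this work. It assumes each shred is a small grid of binary pixels (1 = ink, 0 = blank) and that a candidate reconstruction is a grid of shred indices. It scores a layout by counting pixel mismatches along every shared border. The names `Shred`, `edge_mismatch`, and `layout_cost` are hypothetical, and the empty-vector-based term mentioned above is not modelled.

```python
from typing import List

# Assumption: a shred is a list of pixel rows, each pixel 1 (ink) or 0 (blank).
Shred = List[List[int]]


def left_edge(shred: Shred) -> List[int]:
    """Leftmost pixel column of a shred."""
    return [row[0] for row in shred]


def right_edge(shred: Shred) -> List[int]:
    """Rightmost pixel column of a shred."""
    return [row[-1] for row in shred]


def edge_mismatch(a: List[int], b: List[int]) -> int:
    """Number of positions at which two facing edges disagree."""
    return sum(1 for x, y in zip(a, b) if x != y)


def layout_cost(shreds: List[Shred], layout: List[List[int]]) -> int:
    """Sum of edge mismatches over all horizontal and vertical neighbours.

    `layout[r][c]` is the index (into `shreds`) of the shred placed at
    row r, column c of the candidate reconstruction.
    """
    cost = 0
    for r, row in enumerate(layout):
        for c, idx in enumerate(row):
            if c + 1 < len(row):  # neighbour to the right
                cost += edge_mismatch(right_edge(shreds[idx]),
                                      left_edge(shreds[row[c + 1]]))
            if r + 1 < len(layout):  # neighbour below
                cost += edge_mismatch(shreds[idx][-1],
                                      shreds[layout[r + 1][c]][0])
    return cost


# Toy 4x4 "page" cut into four 2x2 shreds (illustrative data only).
shreds = [
    [[1, 1], [0, 1]],  # top-left
    [[1, 1], [1, 0]],  # top-right
    [[0, 1], [0, 0]],  # bottom-left
    [[1, 0], [0, 0]],  # bottom-right
]
correct = [[0, 1], [2, 3]]
swapped = [[1, 0], [2, 3]]
```

In this toy case the correct layout scores lower than the layout with the top two shreds swapped. A genetic algorithm would use such a cost as its fitness; treating equal border pixels as evidence of adjacency is a simplifying assumption of this sketch.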
