Reliable Steganalysis Using a Minimum Set of Samples and Features

This paper proposes to determine a sufficient number of images for reliable classification and to use feature selection to retain the most relevant features for reliable steganalysis. First, dimensionality issues in the context of classification are outlined, and the impact of the different parameters of a steganalysis scheme (the number of samples, the number of features, the steganography method, and the embedding rate) is studied. On the one hand, Bootstrap simulations show that the standard deviation of the classification results can be very large if the training set is too small; moreover, a minimum of 5000 images is needed to perform reliable steganalysis. On the other hand, we show how feature selection using the OP-ELM classifier both reduces the dimensionality of the data and highlights the weaknesses and advantages of the six most popular steganographic algorithms.
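As an illustration of the Bootstrap argument, the following minimal sketch estimates how the spread of classification accuracy depends on the training set size. It is not the experimental setup of the paper: the features are synthetic Gaussian vectors, a simple nearest-centroid classifier stands in for OP-ELM, and all names and parameters (`make_dataset`, `bootstrap_accuracy_std`, `dim`, `shift`) are assumptions chosen for the example.

```python
import random
import statistics


def make_dataset(rng, n, dim=5, shift=0.5):
    """Synthetic stand-in for image features: label 0 = cover, 1 = stego.

    Stego vectors are cover vectors with every feature mean shifted by `shift`.
    """
    data = []
    for i in range(n):
        label = i % 2
        features = [rng.gauss(shift * label, 1.0) for _ in range(dim)]
        data.append((features, label))
    return data


def fit_centroids(train):
    """Nearest-centroid classifier (an assumption, not OP-ELM): one mean per class."""
    model = {}
    for c in (0, 1):
        rows = [x for x, y in train if y == c]
        model[c] = [sum(col) / len(rows) for col in zip(*rows)]
    return model


def predict(model, x):
    """Return the class whose centroid is closest in squared Euclidean distance."""
    def dist2(c):
        return sum((a - b) ** 2 for a, b in zip(x, model[c]))
    return min((0, 1), key=dist2)


def accuracy(model, data):
    return sum(predict(model, x) == y for x, y in data) / len(data)


def bootstrap_accuracy_std(pool, test_set, n_train, n_rounds, rng):
    """Mean and standard deviation of test accuracy over bootstrap training sets.

    Each round draws n_train samples from the pool with replacement
    (half per class, so that neither class is ever empty).
    """
    by_class = {c: [s for s in pool if s[1] == c] for c in (0, 1)}
    accs = []
    for _ in range(n_rounds):
        train = []
        for c in (0, 1):
            train += rng.choices(by_class[c], k=n_train // 2)
        accs.append(accuracy(fit_centroids(train), test_set))
    return statistics.mean(accs), statistics.stdev(accs)


if __name__ == "__main__":
    rng = random.Random(0)
    pool = make_dataset(rng, 2000)
    test_set = make_dataset(rng, 1000)
    for n_train in (10, 100, 1000):
        mean, std = bootstrap_accuracy_std(pool, test_set, n_train, 30, rng)
        print(f"n_train={n_train:5d}  mean acc={mean:.3f}  std={std:.4f}")
```

On this toy problem the standard deviation shrinks as the training set grows, which is the qualitative effect the abstract describes; the threshold of 5000 images reported in the paper comes from the authors' own experiments and cannot be reproduced by this sketch.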
