Generalized information potential criterion for adaptive system training

We have previously proposed the quadratic Renyi's error entropy as an alternative cost function for supervised adaptive system training. An entropy criterion prescribes minimizing the average information content of the error signal rather than merely minimizing its energy. In this paper, we propose a generalization of the error entropy criterion that enables the use of any order of Renyi's entropy and any suitable kernel function in density estimation. It is shown that the proposed entropy estimator preserves the global minimum of the actual entropy. The equivalence between global optimization by convolution smoothing and the convolution performed by the kernel in Parzen windowing is also discussed. Simulation results on time-series prediction and classification experimentally demonstrate the theoretical concepts.
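The estimator described above combines Renyi's order-α entropy with a Parzen window density estimate, yielding the generalized information potential. A minimal sketch of this idea, assuming a Gaussian kernel and using illustrative function and parameter names (the paper allows any suitable kernel and any α ≠ 1):

```python
import numpy as np

def renyi_entropy(errors, alpha=2.0, sigma=0.5):
    """Nonparametric order-alpha Renyi entropy estimate of an error sample
    via Parzen windowing (Gaussian kernel assumed; alpha must not equal 1)."""
    e = np.asarray(errors, dtype=float)
    # Pairwise kernel evaluations kappa_sigma(e_j - e_i).
    diffs = e[:, None] - e[None, :]
    k = np.exp(-diffs**2 / (2.0 * sigma**2)) / (sigma * np.sqrt(2.0 * np.pi))
    # Generalized information potential:
    #   V_alpha = (1/N) * sum_j [ (1/N) * sum_i kappa_sigma(e_j - e_i) ]^(alpha - 1)
    v_alpha = np.mean(np.mean(k, axis=1) ** (alpha - 1.0))
    # H_alpha = log(V_alpha) / (1 - alpha)
    return np.log(v_alpha) / (1.0 - alpha)
```

In an adaptive-training loop, one would minimize this quantity over the system's weights; since a tightly concentrated error distribution carries less information (lower entropy) than a spread-out one, driving the estimate down concentrates the errors.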
