Source coding and vector quantization with codebook-excited neural networks

Abstract The quantization properties of layered neural networks are studied in this paper. We first review the layered neural network-based coders and quantizers developed in recent years and show that their poor performance is due to the independent training of each component in the coder. We then propose an alternative model, the codebook-excited neural network, in which an encoded vector is approximated by the output of the network driven by one vector selected from an excitation codebook. The network and the excitation codebook are jointly trained with the error back-propagation algorithm. Simulations with a Gauss-Markov source demonstrate that the quantization performance of the codebook-excited feedforward neural network is no worse than that of the connectionist vector quantizer formed by a set of single-layer neural units satisfying the optimal quantization conditions, and that the performance of the codebook-excited recurrent neural network is very close to the asymptotic performance bound of block quantizers. The codebook-excited neural network is applicable with any distortion measure. For a zero-mean, unit-variance, memoryless Gaussian source and a squared-error measure, a 1 bit/sample two-dimensional quantizer built with a codebook-excited feedforward neural network is found to escape consistently from poor local minima and converge to the best of the three local minima known to exist for the vector quantizer designed with the LBG algorithm. Moreover, owing to its conformal mapping characteristic, the codebook-excited neural network can be used to design vector quantizers with any required structural form on their codevectors.
