On the design of connectionist vector quantizers

The Kohonen self-organizing feature map has been widely used for the design of connectionist vector quantizers (VQ). In the Kohonen algorithm the weight-update gain sequence η(m) is a decreasing function of the iteration number m, and a poorly chosen sequence can lead to very long training times. Here we derive the time-optimal gain sequence and demonstrate its efficacy for a number of cases. The performance is evaluated using a Gauss-Markov source and compared with that of a VQ designed using the LBG algorithm. The new method is shown to be time optimal, and its performance approaches that of the LBG-designed VQ. The special case of a connectionist VQ for linear predictive (auto-regressive) data is analysed. The optimal design is derived for the Itakura distance measures and is again compared with the LBG design. Two connectionist models for finite-state vector quantizers (FSVQ) are proposed. These models learn not only the distribution but also the conditional distribution of a source. The connectionist FSVQs exploit the redundancy between successive vectors (frames) of a highly correlated source and thereby further improve the rate-distortion performance. Their architectures and learning algorithms are presented. To evaluate the performance of the proposed quantizers, quantized and unquantized speech spectra are compared. Finally, the connectionist VQ and FSVQ for linear predictive data are applied to a multipulse linear predictive speech coder using data from the TIMIT database, and the resulting waveforms and rate-distortion functions are compared.
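The role of the decreasing gain sequence η(m) can be illustrated with a minimal sketch of competitive-learning VQ design on a first-order Gauss-Markov source. Everything in it is an assumption made for illustration: the function names, the correlation coefficient `rho`, the codebook size, the winner-only update (no neighbourhood function), and in particular the schedule `eta0 / (1 + m / tau)`, which is a generic decreasing gain and not the time-optimal sequence derived in this work.

```python
import random


def gauss_markov(n, dim=2, rho=0.9, seed=0):
    """Return n vectors of `dim` consecutive samples of x[t] = rho*x[t-1] + w[t]."""
    rng = random.Random(seed)
    x, vectors = 0.0, []
    for _ in range(n):
        vec = []
        for _ in range(dim):
            x = rho * x + rng.gauss(0.0, 1.0)  # unit-variance Gaussian innovation
            vec.append(x)
        vectors.append(vec)
    return vectors


def nearest(codebook, v):
    """Index of the codeword with the smallest squared Euclidean distance to v."""
    return min(range(len(codebook)),
               key=lambda i: sum((c - a) ** 2 for c, a in zip(codebook[i], v)))


def train_vq(data, size=8, eta0=0.5, tau=200.0):
    """Winner-only competitive learning with a decreasing gain eta(m)."""
    codebook = [list(v) for v in data[:size]]  # initialise from the first vectors
    for m, v in enumerate(data, start=1):
        eta = eta0 / (1.0 + m / tau)  # assumed schedule, NOT the derived optimum
        winner = codebook[nearest(codebook, v)]
        for k in range(len(winner)):
            winner[k] += eta * (v[k] - winner[k])  # move the winner toward the input
    return codebook


def distortion(codebook, data):
    """Mean squared-error distortion per vector."""
    total = 0.0
    for v in data:
        c = codebook[nearest(codebook, v)]
        total += sum((a - b) ** 2 for a, b in zip(c, v))
    return total / len(data)
```

Calling `train_vq(gauss_markov(5000))` and passing the result to `distortion` gives a quick way to see how different choices of `eta0` and `tau` affect the distortion reached after a fixed number of training vectors, which is the sensitivity that motivates deriving an optimal gain sequence.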
