Self-Organizing Maps and Learning Vector Quantization for Feature Sequences

This work extends the Self-Organizing Map (SOM) and Learning Vector Quantization (LVQ) algorithms to variable-length and warped feature sequences. The novelty is that each SOM node is associated with an entire feature vector sequence as its model, instead of a single feature vector. Dynamic time warping is used to obtain time-normalized distances between sequences of different lengths. Starting from random initialization, ordered maps of feature sequences emerge, and LVQ can then be used to fine-tune the prototype sequences for optimal class separation. The resulting SOM models, that is, the prototype sequences, can be used for both recognition and synthesis of patterns. Good results have been obtained in speaker-independent speech recognition.
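
The time-normalized distance and the search for the best-matching node can be illustrated with a minimal sketch. This is not the authors' implementation. The Euclidean local distance, the unweighted three-way step pattern, and the normalization by the summed sequence lengths are assumptions chosen for illustration, and the names `dtw_distance` and `best_matching_node` are hypothetical.

```python
import math


def dtw_distance(seq_a, seq_b):
    """Time-normalized DTW distance between two sequences of feature vectors.

    Assumptions (not taken from the abstract): Euclidean local distance,
    steps from the left, lower, and diagonal neighbours with equal weight,
    and normalization by len(seq_a) + len(seq_b).
    """
    n, m = len(seq_a), len(seq_b)
    if n == 0 or m == 0:
        raise ValueError("sequences must be non-empty")

    inf = float("inf")
    # cost[i][j] is the minimal accumulated distance aligning the first
    # i vectors of seq_a with the first j vectors of seq_b.
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0

    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = math.dist(seq_a[i - 1], seq_b[j - 1])
            cost[i][j] = local + min(
                cost[i - 1][j],      # seq_a advances, seq_b repeats
                cost[i][j - 1],      # seq_b advances, seq_a repeats
                cost[i - 1][j - 1],  # both advance
            )

    # Divide by the summed lengths so that long and short sequences
    # yield comparable distances.
    return cost[n][m] / (n + m)


def best_matching_node(sequence, prototypes):
    """Index of the prototype sequence closest to `sequence` under DTW."""
    return min(
        range(len(prototypes)),
        key=lambda k: dtw_distance(sequence, prototypes[k]),
    )
```

In this sketch, each entry of `prototypes` plays the role of the model sequence attached to one map node, so the winner search of a sequence SOM reduces to a DTW comparison against every node. How the winning prototype and its neighbours are then updated is not specified in the abstract and is left out here.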
