Paper information - Discriminative training of the pronunciation networks

Presents a new approach to pronunciation network construction. The structure of a pronunciation network is determined by decoding the acoustic data, a procedure that produces a list of the N most probable strings of pronunciation units (SPUs), such as phonemes. The importance of each decoded string is characterized by a set of weight coefficients assigned to the phonemes or to some part of the phonemes. The optimal weight coefficients are defined within the framework of discriminative training: using the minimum classification error (MCE) criterion makes it possible to maximize the discrimination between different pronunciation networks.
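As a rough illustration of the MCE criterion mentioned in the abstract, the sketch below computes the commonly used smoothed misclassification measure and sigmoid loss from a list of per-network scores. It is not the authors' implementation: the abstract does not say how the phoneme weights enter the scores, so the function names, the `scores` list, and the smoothing parameters `eta` and `gamma` are all assumptions made for the example.

```python
import math


def misclassification_measure(scores, correct, eta=1.0):
    """Soft maximum of the competing scores minus the correct score.

    scores  -- hypothetical log-domain score of each pronunciation network
    correct -- index of the network for the true word
    eta     -- assumed smoothing constant; larger values approach a hard max
    """
    competitors = [g for j, g in enumerate(scores) if j != correct]
    m = max(competitors)  # subtract the max for numerical stability
    mean_exp = sum(math.exp(eta * (g - m)) for g in competitors) / len(competitors)
    soft_max = m + math.log(mean_exp) / eta
    return soft_max - scores[correct]


def mce_loss(scores, correct, eta=1.0, gamma=1.0):
    """Sigmoid of the misclassification measure: a smooth 0/1 error count."""
    d = misclassification_measure(scores, correct, eta)
    return 1.0 / (1.0 + math.exp(-gamma * d))


if __name__ == "__main__":
    example_scores = [-10.0, -12.0, -15.0]  # made-up values for illustration
    print(mce_loss(example_scores, correct=0))  # below 0.5: correct network wins
    print(mce_loss(example_scores, correct=2))  # above 0.5: correct network loses
```

Because the loss is differentiable in the scores, any parameters that feed into them (here, presumably the weight coefficients) could be adjusted by gradient descent to push the loss toward zero on training data.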

Biing-Hwang Juang | F. Korkmazskiy
