Discriminative training of the pronunciation networks

This paper presents a new approach to pronunciation network construction. The structure of a pronunciation network is determined by decoding acoustic data to obtain a list of the N most probable strings of pronunciation units (SPUs), such as phonemes. The importance of each decoded string is characterized by a set of weight coefficients assigned to phonemes or to some part of the phonemes. The optimal weight coefficients are defined within the framework of discriminative training: using the minimum classification error (MCE) criterion makes it possible to maximize the discrimination between different pronunciation networks.
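
As an illustration of the kind of objective the MCE criterion implies, the following is a minimal sketch rather than the formulation used in this work. It assumes the commonly used smoothed MCE form: a misclassification measure that compares the score of the correct pronunciation network with a soft maximum over competing networks, passed through a sigmoid. The function names, the weighted-sum scoring rule, and the parameters eta and gamma are all hypothetical and chosen only for this example.

```python
import math


def network_score(weights, unit_scores):
    """Hypothetical scoring rule: weighted sum of per-unit acoustic log-scores
    for one decoded string of pronunciation units."""
    return sum(w * s for w, s in zip(weights, unit_scores))


def mce_loss(correct_score, competitor_scores, eta=1.0, gamma=1.0):
    """Smoothed MCE loss for one training token (assumed standard form).

    correct_score     -- score of the correct word's pronunciation network
    competitor_scores -- scores of the competing networks (non-empty list)
    eta               -- sharpness of the soft maximum over competitors
    gamma             -- slope of the sigmoid
    """
    # Soft maximum over competitors, computed stably by factoring out the max.
    m = max(competitor_scores)
    mean_exp = sum(math.exp(eta * (g - m)) for g in competitor_scores) / len(competitor_scores)
    anti_score = m + math.log(mean_exp) / eta

    # Misclassification measure: positive when competitors outscore the correct network.
    d = -correct_score + anti_score

    # The sigmoid maps d to a differentiable approximation of the 0/1 error count.
    return 1.0 / (1.0 + math.exp(-gamma * d))
```

Under these assumptions, the loss falls below 0.5 when the correct network outscores its competitors and rises above 0.5 otherwise, so adjusting the weights to reduce it increases the separation between networks.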