Key-phrase detection and verification for flexible speech understanding

A novel framework for robust speech understanding is presented. It is based on a detection and verification strategy: rather than decoding whole utterances, it extracts the semantically significant parts and rejects the irrelevant parts. The strategy has two key features. The first is an integrated discriminative verifier that suppresses false alarms, using anti-subword models trained specifically to verify the recognition results. The second is the use of a key-phrase network as the detection unit, which embeds a stochastic constraint on keyword and key-phrase connections to improve coverage and detection rates. The automatic generation of the key-phrase network structure is also addressed. This top-down variable-length language model can be trained with a small corpus and ported to different tasks. This property, coupled with the vocabulary-independent detector and verifier, enhances the portability of the framework.
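The verification step can be illustrated with a minimal sketch. It assumes, as one common formulation and not necessarily the one used in this work, that each detected key-phrase is scored by a duration-normalized log-likelihood ratio between each subword model and its anti-subword model, averaged over the phrase and compared with a threshold. The names `SubwordSegment`, `segment_llr`, `phrase_confidence`, and `verify`, as well as the threshold value, are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class SubwordSegment:
    """One subword segment of a detected key-phrase (hypothetical structure)."""
    label: str            # subword identity, e.g. a phone symbol
    target_loglik: float  # total log-likelihood under the subword model
    anti_loglik: float    # total log-likelihood under the anti-subword model
    num_frames: int       # segment duration in frames


def segment_llr(seg: SubwordSegment) -> float:
    """Per-frame log-likelihood ratio of the subword model to its anti-model."""
    if seg.num_frames <= 0:
        raise ValueError("segment must span at least one frame")
    return (seg.target_loglik - seg.anti_loglik) / seg.num_frames


def phrase_confidence(segments: List[SubwordSegment]) -> float:
    """Average the per-segment ratios into one phrase-level score.

    Assumption: a plain arithmetic mean; other combination rules are possible.
    """
    if not segments:
        raise ValueError("a detected phrase needs at least one segment")
    return sum(segment_llr(s) for s in segments) / len(segments)


def verify(segments: List[SubwordSegment], threshold: float = 0.0) -> bool:
    """Accept the detected phrase if its score reaches the threshold.

    The threshold is an illustrative value that would be tuned on held-out
    data to trade false alarms against false rejections.
    """
    return phrase_confidence(segments) >= threshold
```

A detected phrase whose subword models fit the signal better than their anti-models receives a positive score and is accepted, while a false alarm that the anti-models explain better is rejected. Because the score depends only on subword-level models, such a verifier does not need to be retrained when the keyword vocabulary changes.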
