A Method for Automatic Extraction of Fujisaki-Model Parameters

A model describing pitch profiles in speech signals is of fundamental importance in many application areas, especially in natural-sounding text-to-speech systems. The Fujisaki model [1] has shown considerable accuracy on many languages despite its simplicity. The inverse problem, i.e. the extraction of the input parameters that generated an observed pitch contour, is a much harder task, yet it is of great interest for the automatic extraction of prosodic parameters from a given speech signal. This paper proposes a two-step method for estimating the input parameters: an initial guessing algorithm based on relative extrema, followed by a refinement procedure based on a gradient optimization algorithm. Preliminary results of analysis/synthesis of pitch contours show excellent performance of the proposed method.
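The two steps can be illustrated with a minimal sketch, which is not the implementation evaluated in this paper. It assumes the standard formulation of the Fujisaki model, in which ln F0(t) is a baseline ln Fb plus phrase components Ap·Gp(t − T0) and accent components Aa·[Ga(t − T1) − Ga(t − T2)]. The constants alpha, beta and gamma, all function names, and the choice to refine only the command amplitudes (keeping the timings from the initial guess fixed) are assumptions made for illustration.

```python
import math

def phrase_response(t, alpha=3.0):
    """Impulse response Gp(t) of the phrase control mechanism (alpha is assumed)."""
    return alpha * alpha * t * math.exp(-alpha * t) if t >= 0 else 0.0

def accent_response(t, beta=20.0, gamma=0.9):
    """Step response Ga(t) of the accent control mechanism, capped at gamma."""
    if t < 0:
        return 0.0
    return min(1.0 - (1.0 + beta * t) * math.exp(-beta * t), gamma)

def log_f0(t, fb, phrases, accents):
    """ln F0(t) for phrase commands (T0, Ap) and accent commands (T1, T2, Aa)."""
    value = math.log(fb)
    for t0, ap in phrases:
        value += ap * phrase_response(t - t0)
    for t1, t2, aa in accents:
        value += aa * (accent_response(t - t1) - accent_response(t - t2))
    return value

def relative_extrema(values):
    """Indices of strict local maxima and minima of a sampled contour.

    In a first-guess stage, such points could seed command timings.
    """
    maxima, minima = [], []
    for i in range(1, len(values) - 1):
        if values[i - 1] < values[i] > values[i + 1]:
            maxima.append(i)
        elif values[i - 1] > values[i] < values[i + 1]:
            minima.append(i)
    return maxima, minima

def mse(times, target, fb, phrases, accents):
    """Mean squared error between the model and a target ln F0 contour."""
    return sum((log_f0(t, fb, phrases, accents) - y) ** 2
               for t, y in zip(times, target)) / len(times)

def refine_amplitudes(times, target, fb, phrases, accents, steps=200, rate=0.2):
    """Plain gradient descent on the MSE, updating amplitudes only."""
    phrases = [list(p) for p in phrases]
    accents = [list(a) for a in accents]
    n = len(times)
    for _ in range(steps):
        residual = [log_f0(t, fb, phrases, accents) - y
                    for t, y in zip(times, target)]
        # The model is linear in the amplitudes, so the gradient is analytic.
        grad_p = [2.0 / n * sum(r * phrase_response(t - p[0])
                                for r, t in zip(residual, times))
                  for p in phrases]
        grad_a = [2.0 / n * sum(r * (accent_response(t - a[0])
                                     - accent_response(t - a[1]))
                                for r, t in zip(residual, times))
                  for a in accents]
        for p, g in zip(phrases, grad_p):
            p[1] -= rate * g
        for a, g in zip(accents, grad_a):
            a[2] -= rate * g
    return [tuple(p) for p in phrases], [tuple(a) for a in accents]

# Synthetic usage: build a contour from known commands, then refine a rough guess.
demo_times = [i * 0.05 for i in range(61)]
demo_target = [log_f0(t, 120.0, [(0.0, 0.5)], [(0.5, 1.0, 0.4)]) for t in demo_times]
demo_phrases, demo_accents = refine_amplitudes(
    demo_times, demo_target, 120.0, [(0.0, 0.3)], [(0.5, 1.0, 0.2)])
print(mse(demo_times, demo_target, 120.0, demo_phrases, demo_accents))
```

Restricting the refinement to amplitudes keeps the objective quadratic, so a fixed step size is enough for this sketch. Refining the command timings as well makes the problem non-convex, which is why a good initial guess matters.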

[1] Eva Navas, et al. Modelling Basque intonation using Fujisaki's model and carts, 2000.

[2] Hiroya Fujisaki, et al. Dynamic Characteristics of Voice Fundamental Frequency in Speech and Singing, 1983.

[3] Hansjörg Mixdorff, et al. A novel approach to the fully automatic extraction of Fujisaki model parameters, 2000, 2000 IEEE International Conference on Acoustics, Speech, and Signal Processing. Proceedings (Cat. No.00CH37100).

[4] H. Fujisaki, et al. The use of a generative model of F0 contours for multilingual speech synthesis, 1998, ICSP '98. 1998 Fourth International Conference on Signal Processing (Cat. No.98TH8344).

[5] Keikichi Hirose, et al. Detection of phrase boundaries in Japanese by low-pass filtering of fundamental frequency contours, 1996, Proceeding of Fourth International Conference on Spoken Language Processing. ICSLP '96.

[6] Sumio Ohno, et al. A method for automatic extraction of parameters of the fundamental frequency contour, 2000, INTERSPEECH.

[7] Steven Greenberg, et al. Speaking in shorthand - A syllable-centric perspective for understanding pronunciation variation, 1999, Speech Commun.

[8] Juan Manuel Montero-Martínez, et al. New rule-based and data-driven strategy to incorporate Fujisaki's F0 model to a text-to-speech system in Castillian Spanish, 2001, 2001 IEEE International Conference on Acoustics, Speech, and Signal Processing. Proceedings (Cat. No.01CH37221).

[9] Hiroshi Murata, et al. Analysis and modeling of word accent and sentence intonation in Swedish, 1993, 1993 IEEE International Conference on Acoustics, Speech, and Signal Processing.

[10] M. Tatham, et al. Intonation for synthesis of speaking styles, 2000.