The development of a Portuguese version of a media watch system

This paper summarizes the first year of work on the Portuguese language within the ALERT project. The media watch system that the project aims to build comprises many modules, some of them shared across the project's three languages. The paper concentrates on defining and collecting the linguistic resources needed for Portuguese and on developing the speech recognition, topic detection, and jingle detection modules. It also describes the first version of the ALERT demo for European Portuguese.
