A study of lip movements during spontaneous dialog and its application to voice activity detection.
Christian Jutten | Laurent Girin | Jean-Luc Schwartz | Bertrand Rivet | David Sodoyer | Christophe Savariaux
[1] Climent Nadeu,et al. Automatic Speech Activity Detection, Source Localization, and Speech Recognition on the CHIL Seminar Corpus , 2005, 2005 IEEE International Conference on Multimedia and Expo.
[2] Christian Jutten,et al. An Analysis of Visual Speech Information Applied to Voice Activity Detection , 2006, 2006 IEEE International Conference on Acoustics Speech and Signal Processing Proceedings.
[3] Christian Jutten,et al. Two novel visual voice activity detectors based on appearance models and retinal filtering , 2007, 2007 15th European Signal Processing Conference.
[4] Javier Ramírez,et al. Efficient voice activity detection algorithms using long-term speech information , 2004, Speech Commun..
[5] J. Schwartz,et al. Seeing to hear better: evidence for early audio-visual interactions in speech identification , 2004, Cognition.
[6] H. Lane,et al. The Lombard Sign and the Role of Hearing in Speech , 1971 .
[7] Tsuhan Chen,et al. Cross-Modal Predictive Coding for Talking Head Sequences , 1996 .
[8] Zhu Liu,et al. Integration of multimodal features for video scene classification based on HMM , 1999, 1999 IEEE Third Workshop on Multimedia Signal Processing (Cat. No.99TH8451).
[9] Christian Jutten,et al. Mixing Audiovisual Speech Processing and Blind Source Separation for the Extraction of Speech Signals From Convolutive Mixtures , 2007, IEEE Transactions on Audio, Speech, and Language Processing.
[10] Abeer Alwan,et al. On the Relationship between Face Movements, Tongue Movements, and Speech Acoustics , 2002, EURASIP J. Adv. Signal Process..
[11] C. Neti,et al. A vision-based microphone switch for speech intent detection , 2001, Proceedings IEEE ICCV Workshop on Recognition, Analysis, and Tracking of Faces and Gestures in Real-Time Systems.
[12] Christian Jutten,et al. Separation of Audio-Visual Speech Sources: A New Approach Exploiting the Audio-Visual Coherence of Speech Stimuli , 2002, EURASIP J. Adv. Signal Process..
[13] Laurent Girin. Joint matrix quantization of face parameters and LPC coefficients for low bit rate audiovisual speech coding , 2004, IEEE Transactions on Speech and Audio Processing.
[14] Roland Göcke,et al. Statistical analysis of the relationship between audio and video speech parameters for Australian English , 2003, AVSP.
[15] Jeesun Kim,et al. Investigating the audio-visual speech detection advantage , 2004, Speech Commun..
[16] J. A. Johnson,et al. Point-light facial displays enhance comprehension of speech in noise. , 1996, Journal of speech and hearing research.
[17] J Robert-Ribes,et al. Complementarity and synergy in bimodal speech: auditory, visual, and audio-visual identification of French oral vowels in noise. , 1998, The Journal of the Acoustical Society of America.
[18] R. Campbell,et al. Reading Speech from Still and Moving Faces: The Neural Substrates of Visible Speech , 2003, Journal of Cognitive Neuroscience.
[19] Yannick Deville,et al. Blind separation of dependent sources using the "time-frequency ratio of mixtures" approach , 2003, Seventh International Symposium on Signal Processing and Its Applications, 2003. Proceedings..
[20] W. H. Sumby,et al. Visual contribution to speech intelligibility in noise , 1954 .
[21] J L Schwartz,et al. Audio-visual enhancement of speech in noise. , 2001, The Journal of the Acoustical Society of America.
[22] Piero Cosi,et al. LUCIA: a new Italian talking-head based on a modified Cohen-Massaro's labial coarticulation model , 2003, INTERSPEECH.
[23] C. Benoît,et al. Effects of phonetic context on audio-visual intelligibility of French. , 1994, Journal of speech and hearing research.
[24] Eric Vatikiotis-Bateson,et al. The moving face during speech communication , 1998 .
[25] Q. Summerfield. Some preliminaries to a comprehensive account of audio-visual speech perception. , 1987 .
[26] Q Summerfield,et al. Use of Visual Information for Phonetic Perception , 1979, Phonetica.
[27] P. Bertelson. Chapter 14 Ventriloquism: A case of crossmodal perceptual grouping , 1999 .
[28] Chalapathy Neti,et al. Audio-visual speech enhancement with AVCDCN (audio-visual codebook dependent cepstral normalization) , 2002, Sensor Array and Multichannel Signal Processing Workshop Proceedings, 2002.
[29] Régine Le Bouquin-Jeannès,et al. Study of a voice activity detector and its influence on a noise reduction system , 1995, Speech Commun..
[30] Eric David Petajan,et al. Automatic Lipreading to Enhance Speech Recognition (Speech Reading) , 1984 .
[31] Christian Abry,et al. "Laws" for lips , 1986, Speech Commun..
[32] Jon Barker,et al. Estimation of speech acoustics from visual speech features: A comparison of linear and non-linear models , 1999, AVSP.
[33] Chalapathy Neti,et al. Joint audio-visual speech processing for recognition and enhancement , 2003, AVSP.
[34] Wonyong Sung,et al. A statistical model-based voice activity detection , 1999, IEEE Signal Processing Letters.
[35] P F Seitz,et al. The use of visible speech cues for improving auditory detection of spoken sentences. , 2000, The Journal of the Acoustical Society of America.
[36] M A Goodale,et al. Dynamic visual speech perception in a patient with visual form agnosia , 2002, Neuroreport.
[37] Javier Ramírez,et al. Statistical voice activity detection using a multiple observation likelihood ratio test , 2005, IEEE Signal Processing Letters.
[38] Chalapathy Neti,et al. Audio-visual intent-to-speak detection for human-computer interaction , 2000, 2000 IEEE International Conference on Acoustics, Speech, and Signal Processing. Proceedings (Cat. No.00CH37100).
[39] C. Benoît,et al. A set of French visemes for visual speech synthesis , 1994 .
[40] P. Gribble,et al. Temporal constraints on the McGurk effect , 1996, Perception & psychophysics.
[41] S. Gökhun Tanyer,et al. Voice activity detection in nonstationary noise , 2000, IEEE Trans. Speech Audio Process..
[42] N. Campbell. Approaches to Conversational Speech Rhythm: Speech Activity in Two-Person Telephone Dialogues , 2007 .
[43] Eric D. Petajan. Automatic lipreading to enhance speech recognition , 1984 .
[44] Saeid Sanei,et al. Video assisted speech source separation , 2005, Proceedings. (ICASSP '05). IEEE International Conference on Acoustics, Speech, and Signal Processing, 2005..
[45] Geoffrey P. Bingham,et al. "Dynamics and the orientation of kinematic forms in visual event recognition": Correction. , 1996 .
[46] Christian Jutten,et al. Further experiments on audio-visual speech source separation , 2003, AVSP.
[47] David Malah,et al. Speech enhancement using a minimum mean-square error log-spectral amplitude estimator , 1984, IEEE Trans. Acoust. Speech Signal Process..
[48] Gérard Bailly,et al. Seeing tongue movements from outside , 2002, INTERSPEECH.
[49] Gérard Bailly,et al. Audiovisual Speech Synthesis , 2003, Int. J. Speech Technol..
[50] W. H. Sumby,et al. Erratum: Visual Contribution to Speech Intelligibility in Noise [J. Acoust. Soc. Am. 26, 212 (1954)] , 1954 .
[51] Hani Yehia,et al. Quantitative association of vocal-tract and facial behavior , 1998, Speech Commun..
[52] Christian Benoît,et al. Which components of the face do humans and machines best speechread , 1996 .
[53] Peng Liu,et al. Voice activity detection using visual information , 2004, 2004 IEEE International Conference on Acoustics, Speech, and Signal Processing.
[54] Guillaume Gibert,et al. Analysis and synthesis of the three-dimensional movements of the head, face, and hand of a speaker using cued speech. , 2005, The Journal of the Acoustical Society of America.
[55] Christian Jutten,et al. Developing an audio-visual speech source separation algorithm , 2004, Speech Commun..
[56] Sharon M. Thomas,et al. Contributions of oral and extraoral facial movement to visual and audiovisual speech perception. , 2004, Journal of experimental psychology. Human perception and performance.
[57] L. Rosenblum,et al. An audiovisual test of kinematic primitives for visual speech perception. , 1996, Journal of experimental psychology. Human perception and performance.
[58] Lynne E. Bernstein,et al. Auditory speech detection in noise enhanced by lipreading , 2004, Speech Commun..
[59] Christian Jutten,et al. Visual voice activity detection as a help for speech source separation from convolutive mixtures , 2007, Speech Commun..
[60] N. P. Erber. Auditory-visual perception of speech. , 1975, The Journal of speech and hearing disorders.
[61] H. McGurk,et al. Hearing lips and seeing voices , 1976, Nature.
[62] Chalapathy Neti,et al. Recent advances in the automatic recognition of audiovisual speech , 2003, Proc. IEEE.
[63] Y. Ephraim,et al. Speech enhancement using a minimum mean square error short-time spectral amplitude estimator , 1984 .