Lipreading with long short-term memory
[1] H. McGurk,et al. Hearing lips and seeing voices , 1976, Nature.
[2] Eric David Petajan,et al. Automatic Lipreading to Enhance Speech Recognition (Speech Reading) , 1984 .
[3] Noboru Sugie,et al. A Speech Prosthesis Employing a Speech Synthesizer-Vowel Discrimination from Perioral Muscle Activities and Vowel Production , 1985, IEEE Transactions on Biomedical Engineering.
[4] M.S. Morse,et al. Use of myoelectric signals to recognize speech , 1989, Images of the Twenty-First Century. Proceedings of the Annual International Engineering in Medicine and Biology Society.
[5] Lawrence D. Jackel,et al. Backpropagation Applied to Handwritten Zip Code Recognition , 1989, Neural Computation.
[6] Hervé Bourlard,et al. Connectionist speech recognition , 1993 .
[7] Hervé Bourlard,et al. Connectionist Speech Recognition: A Hybrid Approach , 1993 .
[8] Yochai Konig,et al. "Eigenlips" for robust speech recognition , 1994, Proceedings of ICASSP '94. IEEE International Conference on Acoustics, Speech and Signal Processing.
[9] Jenq-Neng Hwang,et al. Lipreading from color video , 1997, IEEE Trans. Image Process.
[10] Jürgen Schmidhuber,et al. Long Short-Term Memory , 1997, Neural Computation.
[11] Timothy F. Cootes,et al. Active Appearance Models , 1998, ECCV.
[12] Jürgen Schmidhuber,et al. Learning to Forget: Continual Prediction with LSTM , 2000, Neural Computation.
[13] Yoshua Bengio,et al. Gradient Flow in Recurrent Nets: the Difficulty of Learning Long-Term Dependencies , 2001 .
[14] Bruce Denby,et al. Speech synthesis from real time ultrasound images of the tongue , 2004, 2004 IEEE International Conference on Acoustics, Speech, and Signal Processing.
[15] Bill Triggs,et al. Histograms of oriented gradients for human detection , 2005, 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'05).
[16] Jon Barker,et al. An audio-visual corpus for speech perception and automatic speech recognition , 2006, The Journal of the Acoustical Society of America.
[17] Jürgen Schmidhuber,et al. Connectionist temporal classification: labelling unsegmented sequence data with recurrent neural networks , 2006, ICML.
[18] Gérard Chollet,et al. Continuous-speech phone recognition from ultrasound and optical images of the tongue and lips , 2007, INTERSPEECH.
[19] Gérard Chollet,et al. Eigentongue Feature Extraction for an Ultrasound-Based Silent Speech Interface , 2007, 2007 IEEE International Conference on Acoustics, Speech and Signal Processing - ICASSP '07.
[20] J. M. Gilbert,et al. Development of a (silent) speech recognition system for patients following laryngectomy , 2008, Medical engineering & physics.
[21] Matti Pietikäinen,et al. Lipreading with Local Spatiotemporal Descriptors , 2009, IEEE Transactions on Multimedia.
[22] Barry-John Theobald,et al. Comparing visual features for lipreading , 2009, AVSP.
[23] Gérard Chollet,et al. Development of a silent speech interface driven by ultrasound and optical images of the tongue and lips , 2010, Speech Commun.
[24] J. M. Gilbert,et al. Silent speech interfaces , 2010, Speech Commun.
[25] Andrea Vedaldi,et al. Vlfeat: an open and portable library of computer vision algorithms , 2010, ACM Multimedia.
[26] Tanja Schultz,et al. Modeling coarticulation in EMG-based continuous speech recognition , 2010, Speech Commun.
[27] Naomi Harte,et al. Viseme definitions comparison for visual-only speech recognition , 2011, 2011 19th European Signal Processing Conference.
[28] Jürgen Schmidhuber,et al. A committee of neural networks for traffic sign classification , 2011, The 2011 International Joint Conference on Neural Networks.
[29] Tara N. Sainath,et al. Fundamental Technologies in Modern Speech Recognition , 2012, DOI 10.1109/MSP.2012.2205597.
[30] Barry-John Theobald,et al. Is automated conversion of video to text a reality? , 2012, Optics/Photonics in Security and Defence.
[31] Tara N. Sainath,et al. Deep Neural Networks for Acoustic Modeling in Speech Recognition , 2012 .
[32] Navdeep Jaitly,et al. Hybrid speech recognition with Deep Bidirectional LSTM , 2013, 2013 IEEE Workshop on Automatic Speech Recognition and Understanding.
[33] Geoffrey E. Hinton,et al. Speech recognition with deep recurrent neural networks , 2013, 2013 IEEE International Conference on Acoustics, Speech and Signal Processing.
[34] Barry-John Theobald,et al. Recent developments in automated lip-reading , 2013, Optics/Photonics in Security and Defence.
[35] Navdeep Jaitly,et al. Towards End-To-End Speech Recognition with Recurrent Neural Networks , 2014, ICML.
[36] Tetsuya Ogata,et al. Lipreading using convolutional neural network , 2014, INTERSPEECH.
[37] Erich Elsen,et al. Deep Speech: Scaling up end-to-end speech recognition , 2014, ArXiv.
[38] Carlos Busso,et al. Lipreading approach for isolated digits recognition under whisper and neutral speech , 2014, INTERSPEECH.
[39] Tanja Schultz,et al. Tackling Speaking Mode Varieties in EMG-Based Speech Recognition , 2014, IEEE Transactions on Biomedical Engineering.
[40] Jürgen Schmidhuber,et al. LSTM: A Search Space Odyssey , 2015, IEEE Transactions on Neural Networks and Learning Systems.