Investigations on End-to-End Audiovisual Fusion
[1] Gregory J. Wolff,et al. Neural network lipreading system for improved speech recognition , 1992, [Proceedings 1992] IJCNN International Joint Conference on Neural Networks.
[2] J. M. Gilbert,et al. Silent speech interfaces , 2010, Speech Commun..
[3] Joon Son Chung,et al. Lip Reading in the Wild , 2016, ACCV.
[4] Xavier Serra,et al. Freesound technical demo , 2013, ACM Multimedia.
[5] Jürgen Schmidhuber,et al. Lipreading with long short-term memory , 2016, 2016 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[6] Tanja Schultz,et al. Biosignal-Based Spoken Communication: A Survey , 2017, IEEE/ACM Transactions on Audio, Speech, and Language Processing.
[7] Jürgen Schmidhuber,et al. Long Short-Term Memory , 1997, Neural Computation.
[8] Jon Barker,et al. An audio-visual corpus for speech perception and automatic speech recognition , 2006, The Journal of the Acoustical Society of America.
[9] Eric David Petajan,et al. Automatic Lipreading to Enhance Speech Recognition (Speech Reading) , 1984.
[10] Satoshi Tamura,et al. Integration of deep bottleneck features for audio-visual speech recognition , 2015, INTERSPEECH.
[11] Davis E. King,et al. Dlib-ml: A Machine Learning Toolkit , 2009, J. Mach. Learn. Res..
[12] Ning Ma,et al. Improving audio-visual speech recognition using deep neural networks with dynamic stream reliability estimates , 2017, 2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[13] Tetsuya Ogata,et al. Lipreading using convolutional neural network , 2014, INTERSPEECH.
[14] Andrew Zisserman,et al. Deep Inside Convolutional Networks: Visualising Image Classification Models and Saliency Maps , 2013, ICLR.
[15] Juhan Nam,et al. Multimodal Deep Learning , 2011, ICML.
[16] Dorothea Kolossa,et al. Learning Dynamic Stream Weights For Coupled-HMM-Based Audio-Visual Speech Recognition , 2015, IEEE/ACM Transactions on Audio, Speech, and Language Processing.
[17] Björn W. Schuller,et al. Recent developments in openSMILE, the Munich open-source multimedia feature extractor , 2013, ACM Multimedia.
[18] Robert M. Nickel,et al. Dynamic Stream Weighting for Turbo-Decoding-Based Audiovisual ASR , 2016, INTERSPEECH.
[19] Jing Huang,et al. Audio-visual deep learning for noise robust speech recognition , 2013, 2013 IEEE International Conference on Acoustics, Speech and Signal Processing.
[20] Matti Pietikäinen,et al. A review of recent advances in visual speech decoding , 2014, Image Vis. Comput..
[21] Yochai Konig,et al. "Eigenlips" for robust speech recognition , 1994, Proceedings of ICASSP '94. IEEE International Conference on Acoustics, Speech and Signal Processing.
[22] Chalapathy Neti,et al. Recent advances in the automatic recognition of audiovisual speech , 2003, Proc. IEEE.
[23] Jürgen Schmidhuber,et al. Improving Speaker-Independent Lipreading with Domain-Adversarial Training , 2017, INTERSPEECH.
[24] Petr Motlícek,et al. A Large-Scale Open-Source Acoustic Simulator for Speaker Recognition , 2016, IEEE Signal Processing Letters.