Analysis of the sensitivity of the End-Of-Turn Detection task to errors generated by the Automatic Speech Recognition process

Roberto Santana | José Antonio Lozano | César Montenegro

[1] Xiaoying Liu, et al. Sentence Similarity based on Dynamic Time Warping, 2007, International Conference on Semantic Computing (ICSC 2007).

[2] Waleed H. Abdulla, et al. Cross-words reference template for DTW-based speech recognition systems, 2003, TENCON 2003. Conference on Convergent Technologies for Asia-Pacific Region.

[3] Emily Mower Provost, et al. Improving End-of-Turn Detection in Spoken Dialogues by Detecting Speaker Intentions as a Secondary Task, 2018, 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).

[4] Philip Chan, et al. Toward accurate dynamic time warping in linear time and space, 2007, Intell. Data Anal.

[5] Marco Matassoni, et al. Noise-tolerant speech recognition: the SNN-TA approach, 2003, Inf. Sci.

[6] Vladimir I. Levenshtein, et al. Binary codes capable of correcting deletions, insertions, and reversals, 1965.

[7] Liang Tao, et al. Unsupervised learning of phonemes of whispered speech in a noisy environment based on convolutive non-negative matrix factorization, 2014, Inf. Sci.

[8] John R. Hershey, et al. Hybrid CTC/Attention Architecture for End-to-End Speech Recognition, 2017, IEEE Journal of Selected Topics in Signal Processing.

[9] Francesco Piazza, et al. Environmental robust speech and speaker recognition through multi-channel histogram equalization, 2012, Neurocomputing.

[10] Julia Hirschberg, et al. Turn-taking cues in task-oriented dialogue, 2011, Comput. Speech Lang.

[11] Chip-Hong Chang, et al. Bayesian Separation With Sparsity Promotion in Perceptual Wavelet Domain for Speech Enhancement and Hybrid Speech Recognition, 2011, IEEE Transactions on Systems, Man, and Cybernetics - Part A: Systems and Humans.

[12] Tara N. Sainath, et al. State-of-the-Art Speech Recognition with Sequence-to-Sequence Models, 2017, 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).

[13] Jeffrey Pennington, et al. GloVe: Global Vectors for Word Representation, 2014, EMNLP.

[14] Stan Davis, et al. Comparison of Parametric Representations for Monosyllabic Word Recognition in Continuously Spoken Se, 1980.

[15] Seyed Reza Shahamiri, et al. Real-time frequency-based noise-robust Automatic Speech Recognition using Multi-Nets Artificial Neural Networks: A multi-views multi-learners approach, 2014, Neurocomputing.

[16] Hiroshi Ishiguro, et al. Turn-Taking Estimation Model Based on Joint Embedding of Lexical and Prosodic Contents, 2017, INTERSPEECH.

[17] Abeer Alwan, et al. A Low-Complexity Parabolic Lip Contour Model With Speaker Normalization for High-Level Feature Extraction in Noise-Robust Audiovisual Speech Recognition, 2008, IEEE Transactions on Systems, Man, and Cybernetics - Part A: Systems and Humans.

[18] Kristin Precoda, et al. Speech Recognition Engineering Issues in Speech to Speech Translation System Design for Low Resource Languages and Domains, 2006, 2006 IEEE International Conference on Acoustics Speech and Signal Processing Proceedings.

[19] John J. Godfrey, et al. SWITCHBOARD: telephone speech corpus for research and development, 1992, [Proceedings] ICASSP-92: 1992 IEEE International Conference on Acoustics, Speech, and Signal Processing.

[20] Abderrahmane Amrouche, et al. An efficient speech recognition system in adverse conditions using the nonparametric regression, 2010, Eng. Appl. Artif. Intell.

[21] Ascensión Gallardo-Antolín, et al. An attention Long Short-Term Memory based system for automatic classification of speech intelligibility, 2020, Eng. Appl. Artif. Intell.

[22] Visar Berisha, et al. Investigating the Effects of Word Substitution Errors on Sentence Embeddings, 2019, ICASSP 2019 - 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).

[23] Frédéric Alexandre, et al. Spatio-temporal biologically inspired models for clean and noisy speech recognition, 2007, Neurocomputing.

[24] Geoffrey E. Hinton, et al. Speech recognition with deep recurrent neural networks, 2013, 2013 IEEE International Conference on Acoustics, Speech and Signal Processing.