Analysis of the sensitivity of the End-Of-Turn Detection task to errors generated by the Automatic Speech Recognition process

[1] Xiaoying Liu, et al. Sentence Similarity based on Dynamic Time Warping, 2007, International Conference on Semantic Computing (ICSC 2007).

[2] Waleed H. Abdulla, et al. Cross-words reference template for DTW-based speech recognition systems, 2003, TENCON 2003. Conference on Convergent Technologies for Asia-Pacific Region.

[3] Emily Mower Provost, et al. Improving End-of-Turn Detection in Spoken Dialogues by Detecting Speaker Intentions as a Secondary Task, 2018, 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).

[4] Philip Chan, et al. Toward accurate dynamic time warping in linear time and space, 2007, Intell. Data Anal.

[5] Marco Matassoni, et al. Noise-tolerant speech recognition: the SNN-TA approach, 2003, Inf. Sci.

[6] Vladimir I. Levenshtein, et al. Binary codes capable of correcting deletions, insertions, and reversals, 1965.

[7] Liang Tao, et al. Unsupervised learning of phonemes of whispered speech in a noisy environment based on convolutive non-negative matrix factorization, 2014, Inf. Sci.

[8] John R. Hershey, et al. Hybrid CTC/Attention Architecture for End-to-End Speech Recognition, 2017, IEEE Journal of Selected Topics in Signal Processing.

[9] Francesco Piazza, et al. Environmental robust speech and speaker recognition through multi-channel histogram equalization, 2012, Neurocomputing.

[10] Julia Hirschberg, et al. Turn-taking cues in task-oriented dialogue, 2011, Comput. Speech Lang.

[11] Chip-Hong Chang, et al. Bayesian Separation With Sparsity Promotion in Perceptual Wavelet Domain for Speech Enhancement and Hybrid Speech Recognition, 2011, IEEE Transactions on Systems, Man, and Cybernetics - Part A: Systems and Humans.

[12] Tara N. Sainath, et al. State-of-the-Art Speech Recognition with Sequence-to-Sequence Models, 2017, 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).

[13] Jeffrey Pennington, et al. GloVe: Global Vectors for Word Representation, 2014, EMNLP.

[14] Stan Davis, et al. Comparison of Parametric Representations for Monosyllabic Word Recognition in Continuously Spoken Se, 1980.

[15] Seyed Reza Shahamiri, et al. Real-time frequency-based noise-robust Automatic Speech Recognition using Multi-Nets Artificial Neural Networks: A multi-views multi-learners approach, 2014, Neurocomputing.

[16] Hiroshi Ishiguro, et al. Turn-Taking Estimation Model Based on Joint Embedding of Lexical and Prosodic Contents, 2017, INTERSPEECH.

[17] Abeer Alwan, et al. A Low-Complexity Parabolic Lip Contour Model With Speaker Normalization for High-Level Feature Extraction in Noise-Robust Audiovisual Speech Recognition, 2008, IEEE Transactions on Systems, Man, and Cybernetics - Part A: Systems and Humans.

[18] Kristin Precoda, et al. Speech Recognition Engineering Issues in Speech to Speech Translation System Design for Low Resource Languages and Domains, 2006, 2006 IEEE International Conference on Acoustics Speech and Signal Processing Proceedings.

[19] John J. Godfrey, et al. SWITCHBOARD: telephone speech corpus for research and development, 1992, [Proceedings] ICASSP-92: 1992 IEEE International Conference on Acoustics, Speech, and Signal Processing.

[20] Abderrahmane Amrouche, et al. An efficient speech recognition system in adverse conditions using the nonparametric regression, 2010, Eng. Appl. Artif. Intell.

[21] Ascensión Gallardo-Antolín, et al. An attention Long Short-Term Memory based system for automatic classification of speech intelligibility, 2020, Eng. Appl. Artif. Intell.

[22] Visar Berisha, et al. Investigating the Effects of Word Substitution Errors on Sentence Embeddings, 2019, ICASSP 2019 - 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).

[23] Frédéric Alexandre, et al. Spatio-temporal biologically inspired models for clean and noisy speech recognition, 2007, Neurocomputing.

[24] Geoffrey E. Hinton, et al. Speech recognition with deep recurrent neural networks, 2013, 2013 IEEE International Conference on Acoustics, Speech and Signal Processing.