论文信息 - The shared views of four research groups ) - 字舞流文

The shared views of four research groups )

Tara N. Sainath | Brian Kingsbury | Geoffrey E. Hinton | Dong Yu | Li Deng | Navdeep Jaitly | Patrick Nguyen | Abdel-rahman Mohamed | Vincent Vanhoucke | George E. Dahl | Andrew W. Senior | Vincent Vanhoucke | A. Senior | Abdel-rahman Mohamed | Dong Yu | Navdeep Jaitly | T. Sainath | Brian Kingsbury | Liya Deng | Patrick Nguyen | N. Jaitly

[1] Li Deng,et al. Computational Models for Speech Production , 2018, Speech Processing.

[2] Dong Yu,et al. Conversational Speech Transcription Using Context-Dependent Deep Neural Networks , 2012, ICML.

[3] Tara N. Sainath,et al. Auto-encoder bottleneck features using deep belief networks , 2012, 2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).

[4] Geoffrey E. Hinton,et al. Understanding how Deep Belief Networks perform acoustic modelling , 2012, 2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).

[5] Gerald Penn,et al. Applying Convolutional Neural Networks concepts to hybrid NN-HMM model for speech recognition , 2012, 2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).

[6] Chin-Hui Lee,et al. Boosting attribute and phone estimation accuracies with deep neural networks for detection-based speech recognition , 2012, 2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).

[7] Tara N. Sainath,et al. Improved pre-training of Deep Belief Networks using Sparse Encoding Symmetric Machines , 2012, 2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).

[8] Dong Yu,et al. A deep architecture with bilinear modeling of hidden representations: Applications to phonetic recognition , 2012, 2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).

[9] Heiga Zen,et al. Product of Experts for Statistical Parametric Speech Synthesis , 2012, IEEE Transactions on Audio, Speech, and Language Processing.

[10] Dong Yu,et al. Scalable stacking and learning for building deep architectures , 2012, 2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).

[11] Hynek Hermansky,et al. Sparse Multilayer Perceptron for Phoneme Recognition , 2012, IEEE Transactions on Audio, Speech, and Language Processing.

[12] Nelson Morgan,et al. Deep and Wide: Multiple Layers in Automatic Speech Recognition , 2012, IEEE Transactions on Audio, Speech, and Language Processing.

[13] Navdeep Jaitly,et al. Application of Pretrained Deep Neural Networks to Large Vocabulary Conversational Speech Recognition , 2012 .

[14] Geoffrey E. Hinton,et al. Acoustic Modeling Using Deep Belief Networks , 2012, IEEE Transactions on Audio, Speech, and Language Processing.

[15] Dong Yu,et al. Context-Dependent Pre-Trained Deep Neural Networks for Large-Vocabulary Speech Recognition , 2012, IEEE Transactions on Audio, Speech, and Language Processing.

[16] Geoffrey E. Hinton. A Practical Guide to Training Restricted Boltzmann Machines , 2012, Neural Networks: Tricks of the Trade.

[17] Dong Yu,et al. Feature engineering in Context-Dependent Deep Neural Networks for conversational speech transcription , 2011, 2011 IEEE Workshop on Automatic Speech Recognition & Understanding.

[18] Tara N. Sainath,et al. Exemplar-Based Sparse Representation Features: From TIMIT to LVCSR , 2011, IEEE Transactions on Audio, Speech, and Language Processing.

[19] Dong Yu,et al. Deep Convex Net: A Scalable Architecture for Speech Pattern Classification , 2011, INTERSPEECH.

[20] Pascal Vincent,et al. Contractive Auto-Encoders: Explicit Invariance During Feature Extraction , 2011, ICML.

[21] Quoc V. Le,et al. On optimization methods for deep learning , 2011, ICML.

[22] Geoffrey Zweig,et al. Speech recognitionwith segmental conditional random fields: A summary of the JHU CLSP 2010 Summer Workshop , 2011, 2011 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).

[23] Tara N. Sainath,et al. Deep Belief Networks using discriminative features for phone recognition , 2011, 2011 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).

[24] Oriol Vinyals,et al. Comparing multilayer perceptron to Deep Belief Network Tandem features for robust ASR , 2011, 2011 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).

[25] Vincent Vanhoucke,et al. Improving the speed of neural networks on CPUs , 2011 .

[26] Geoffrey E. Hinton,et al. Phone Recognition with the Mean-Covariance Restricted Boltzmann Machine , 2010, NIPS.

[27] Dong Yu,et al. Roles of Pre-Training and Fine-Tuning in Context-Dependent DBN-HMMs for Real-World Speech Recognition , 2010 .

[28] Dong Yu,et al. Investigation of full-sequence training of deep belief networks for speech recognition , 2010, INTERSPEECH.

[29] James Martens,et al. Deep learning via Hessian-free optimization , 2010, ICML.

[30] Yoshua Bengio,et al. Understanding the difficulty of training deep feedforward neural networks , 2010, AISTATS.

[31] Eric Fosler-Lussier,et al. Backpropagation training for multilayer conditional random field based phone recognition , 2010, 2010 IEEE International Conference on Acoustics, Speech and Signal Processing.

[32] Pascal Vincent,et al. Stacked Denoising Autoencoders: Learning Useful Representations in a Deep Network with a Local Denoising Criterion , 2010, J. Mach. Learn. Res..

[33] Luca Maria Gambardella,et al. Deep, Big, Simple Neural Nets for Handwritten Digit Recognition , 2010, Neural Computation.

[34] Honglak Lee,et al. Unsupervised feature learning for audio classification using convolutional deep belief networks , 2009, NIPS.

[35] Tara N. Sainath,et al. An exploration of large vocabulary tools for small vocabulary phonetic recognition , 2009, 2009 IEEE Workshop on Automatic Speech Recognition & Understanding.

[36] Brian Kingsbury,et al. Lattice-based optimization of sequence classification criteria for neural-network acoustic modeling , 2009, 2009 IEEE International Conference on Acoustics, Speech and Signal Processing.

[37] James R. Glass,et al. Developments and directions in speech recognition and understanding, Part 1 [DSP Education] , 2009, IEEE Signal Processing Magazine.

[38] Steve Renals,et al. Speech Recognition Using Augmented Conditional Random Fields , 2009, IEEE Transactions on Audio, Speech, and Language Processing.

[39] Brian Kingsbury,et al. Boosted MMI for model and feature-space discriminative training , 2008, 2008 IEEE International Conference on Acoustics, Speech and Signal Processing.

[40] Yoshua Bengio,et al. An empirical evaluation of deep architectures on problems with many factors of variation , 2007, ICML '07.

[41] Jan Cernocký,et al. Probabilistic and Bottle-Neck Features for LVCSR of Meetings , 2007, 2007 IEEE International Conference on Acoustics, Speech and Signal Processing - ICASSP '07.

[42] Dong Yu,et al. Use of Differential Cepstra as Acoustic Features in Hidden Trajectory Modeling for Phonetic Recognition , 2007, 2007 IEEE International Conference on Acoustics, Speech and Signal Processing - ICASSP '07.

[43] Dong Yu,et al. Structured speech modeling , 2006, IEEE Transactions on Audio, Speech, and Language Processing.

[44] Geoffrey E. Hinton,et al. Reducing the Dimensionality of Data with Neural Networks , 2006, Science.

[45] Yee Whye Teh,et al. A Fast Learning Algorithm for Deep Belief Nets , 2006, Neural Computation.

[46] N. Morgan,et al. Pushing the envelope - aside [speech recognition] , 2005, IEEE Signal Processing Magazine.

[47] Li Deng,et al. Switching Dynamic System Models for Speech Articulation and Acoustics , 2004 .

[48] Geoffrey E. Hinton. Training Products of Experts by Minimizing Contrastive Divergence , 2002, Neural Computation.

[49] Li Deng,et al. An overlapping-feature-based phonological model incorporating linguistic constraints: applications to speech recognition. , 2002, The Journal of the Acoustical Society of America.

[50] Daniel Povey,et al. Large scale discriminative training of hidden Markov models for speech recognition , 2002, Comput. Speech Lang..

[51] Daniel P. W. Ellis,et al. Tandem connectionist feature extraction for conventional HMM systems , 2000, 2000 IEEE International Conference on Acoustics, Speech, and Signal Processing. Proceedings (Cat. No.00CH37100).

[52] James R. Glass,et al. Heterogeneous measurements and multiple classifiers for speech recognition , 1998, ICSLP.

[53] Francis Jack Smith,et al. Improved phone recognition using Bayesian triphone models , 1998, Proceedings of the 1998 IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP '98 (Cat. No.98CH36181).

[54] Hermann Ney,et al. A word graph algorithm for large vocabulary continuous speech recognition , 1994, Comput. Speech Lang..

[55] Hervé Bourlard,et al. Connectionist Speech Recognition: A Hybrid Approach , 1993 .

[56] Li Deng,et al. Speech recognition using the atomic speech units constructed from overlapping articulatory features , 1994, EUROSPEECH.

[57] Yoshua Bengio,et al. Global optimization of a neural network-hidden Markov model hybrid , 1992, IEEE Trans. Neural Networks.

[58] H Hermansky,et al. Perceptual linear predictive (PLP) analysis of speech. , 1990, The Journal of the Acoustical Society of America.

[59] Judea Pearl,et al. Probabilistic reasoning in intelligent systems - networks of plausible inference , 1991, Morgan Kaufmann series in representation and reasoning.

[60] Sadaoki Furui,et al. Digital Speech Processing, Synthesis, and Recognition , 1989 .

[61] Geoffrey E. Hinton,et al. Learning representations by back-propagating errors , 1986, Nature.

[62] Lalit R. Bahl,et al. Maximum mutual information estimation of hidden Markov model parameters for speech recognition , 1986, ICASSP '86. IEEE International Conference on Acoustics, Speech, and Signal Processing.

[63] Biing-Hwang Juang,et al. Maximum likelihood estimation for multivariate mixture observations of markov chains , 1986, IEEE Trans. Inf. Theory.

[64] S. Furui,et al. Cepstral analysis technique for automatic speaker verification , 1981 .