Time-frequency analysis and auditory modeling for automatic recognition of speech
[1] B. Moore. An introduction to the psychology of hearing (5th ed.). , 1989 .
[2] J. Ville. THEORY AND APPLICATION OF THE NOTION OF COMPLEX SIGNAL , 1958 .
[3] W. Koenig,et al. The Sound Spectrograph , 1946 .
[4] Shihab A. Shamma,et al. The acoustic features of speech sounds in a model of auditory processing: vowels and voiceless fricatives , 1988 .
[5] J. Pickles. An Introduction to the Physiology of Hearing , 1982 .
[6] James J. Jenkins,et al. Dynamic specification of coarticulated vowels , 1983 .
[7] H Hermansky,et al. Perceptual linear predictive (PLP) analysis of speech. , 1990, The Journal of the Acoustical Society of America.
[8] Stan Davis,et al. Comparison of Parametric Representations for Monosyllabic Word Recognition in Continuously Spoken Sentences , 1980 .
[9] J Harrington,et al. The contribution of the murmur and vowel to the place of articulation distinction in nasal consonants. , 1994, The Journal of the Acoustical Society of America.
[10] J. Allen,et al. Cochlear modeling , 1985, IEEE ASSP Magazine.
[11] Steven Greenberg,et al. A Composite Model of the Auditory Periphery for the Processing of Speech (Invited) , 1988 .
[12] M. Sachs,et al. Representation of steady-state vowels in the temporal aspects of the discharge patterns of populations of auditory-nerve fibers. , 1979, The Journal of the Acoustical Society of America.
[13] G. E. Peterson,et al. Control Methods Used in a Study of the Vowels , 1951 .
[14] A. Gray,et al. Least squares glottal inverse filtering from the acoustic speech waveform , 1979 .
[15] B H Repp,et al. Acoustic properties and perception of stop consonant release transients. , 1989, The Journal of the Acoustical Society of America.
[16] Robert J. Marks,et al. The use of cone-shaped kernels for generalized time-frequency representations of nonstationary signals , 1990, IEEE Trans. Acoust. Speech Signal Process..
[17] J J Jenkins,et al. Vowel identification in mixed-speaker silent-center syllables. , 1994, The Journal of the Acoustical Society of America.
[18] Ted H. Applebaum,et al. Perceptually-based dynamic spectrograms , 1993 .
[19] Harvey F. Silverman,et al. A time-varying analysis method for rapid transitions in speech , 1991, IEEE Trans. Signal Process..
[20] E. Wigner. On the quantum correction for thermodynamic equilibrium , 1932 .
[21] Nobuhiro Miki,et al. Adaptive identification of a time-varying ARMA speech model , 1986, IEEE Trans. Acoust. Speech Signal Process..
[22] L. Carney,et al. A model for the responses of low-frequency auditory-nerve fibers in cat. , 1993, The Journal of the Acoustical Society of America.
[23] L. Cohen. Generalized Phase-Space Distribution Functions , 1966 .
[24] L. A. Westerman,et al. A diffusion model of the transient response of the cochlear inner hair cell synapse. , 1988, The Journal of the Acoustical Society of America.
[25] Li Deng,et al. A generalized hidden Markov model with state-conditioned trend functions of time for the speech signal , 1992, Signal Process..
[26] Oded Ghitza,et al. Auditory nerve representation as a front-end for speech recognition in a noisy environment , 1986 .
[27] Petros Maragos,et al. Energy separation in signal modulations with application to speech analysis , 1993, IEEE Trans. Signal Process..
[28] R. Smits. Accuracy of quasistationary analysis of highly dynamic speech signals , 1994 .
[29] B. Kedem,et al. Spectral analysis and discrimination by zero-crossings , 1986, Proceedings of the IEEE.
[31] William Bialek,et al. Optimal Real-Time Signal Processing in the Nervous System , 1993 .
[32] Shiufun Cheung,et al. Combined multi-resolution (wideband/narrowband) spectrogram , 1991, [Proceedings] ICASSP 91: 1991 International Conference on Acoustics, Speech, and Signal Processing.
[33] Athanasios Papoulis,et al. Probability, Random Variables and Stochastic Processes , 1965 .
[34] Harvey F. Silverman,et al. Time-varying feature selection and classification of unvoiced stop consonants , 1994, IEEE Trans. Speech Audio Process..
[35] B.-H. Juang,et al. Maximum-likelihood estimation for mixture multivariate stochastic observations of Markov chains , 1985, AT&T Technical Journal.
[36] Patrick J. Loughlin,et al. Advanced time-frequency representations for speech processing , 1993 .
[37] Les E. Atlas,et al. Construction of positive time-frequency distributions , 1994, IEEE Trans. Signal Process..
[38] Mari Ostendorf,et al. ML estimation of a stochastic linear system with the EM algorithm and its application to speech recognition , 1993, IEEE Trans. Speech Audio Process..
[39] E. Bryan George,et al. Co-channel speaker separation , 1995, 1995 International Conference on Acoustics, Speech, and Signal Processing.
[40] William J. Byrne,et al. The Auditory Processing and Recognition of Speech , 1989, HLT.
[41] Richard F. Lyon,et al. Computational models of neural auditory processing , 1984, ICASSP.
[42] Francisco Casacuberta,et al. A nonstationary model for the analysis of transient speech signals , 1987, IEEE Trans. Acoust. Speech Signal Process..
[43] Oded Ghitza,et al. Hidden Markov models with templates as non-stationary states: an application to speech recognition , 1993, Comput. Speech Lang..
[44] C D Geisler,et al. Comparison of the responses of auditory nerve fibers to consonant-vowel syllables with predictions from linear models. , 1984, The Journal of the Acoustical Society of America.
[45] J. Eggermont. Peripheral auditory adaptation and fatigue: A model oriented review , 1985, Hearing Research.
[46] Q. Summerfield,et al. Modeling the perception of concurrent vowels: vowels with the same fundamental frequency. , 1989, The Journal of the Acoustical Society of America.
[47] Oded Ghitza,et al. A comparative study of mel cepstra and EIH for phone classification under adverse conditions , 1995, 1995 International Conference on Acoustics, Speech, and Signal Processing.
[48] E. F. Velez,et al. Transient analysis of speech signals using the Wigner time-frequency representation , 1989, International Conference on Acoustics, Speech, and Signal Processing.
[49] Biing-Hwang Juang,et al. Robust utterance verification for connected digits recognition , 1995, 1995 International Conference on Acoustics, Speech, and Signal Processing.
[50] S. Seneff. A joint synchrony/mean-rate model of auditory processing , 1988 .
[51] H. K. Dunn. Methods of Measuring Vowel Formant Bandwidths , 1961 .
[52] G. Fairbanks,et al. Diphthong formants and their movements. , 1962, Journal of speech and hearing research.
[53] Xiaodong Sun,et al. Speech recognition using hidden Markov models with polynomial regression functions as nonstationary states , 1994, IEEE Trans. Speech Audio Process..
[54] Q. Summerfield,et al. Modeling the perception of concurrent vowels: vowels with different fundamental frequencies. , 1990, The Journal of the Acoustical Society of America.
[55] Les E. Atlas,et al. Bilinear time-frequency representations: new insights and properties , 1993, IEEE Trans. Signal Process..
[56] D Kewley-Port,et al. Time-varying features as correlates of place of articulation in stop consonants. , 1983, The Journal of the Acoustical Society of America.
[57] John E. Markel,et al. Linear Prediction of Speech , 1976, Communication and Cybernetics.
[58] C V Pavlovic,et al. Frequency importance functions for a feature recognition test material. , 1988, The Journal of the Acoustical Society of America.
[59] O. Kakusho,et al. Hierarchical AR model for time varying speech signals , 1982, ICASSP.
[60] Benjamin Kedem,et al. Authors' Reply to Comments on 'Zero-crossing rates of functions of Gaussian processes' , 1991, IEEE Trans. Inf. Theory.
[61] Les Atlas,et al. New non-stationary techniques for the analysis and display of speech transients , 1990, International Conference on Acoustics, Speech, and Signal Processing.
[62] R Plomp,et al. Objective analysis versus subjective assessment of vowels pronounced by native, non-native, and deaf male speakers of Dutch. , 1993, The Journal of the Acoustical Society of America.
[63] M. Sachs,et al. Encoding of steady-state vowels in the auditory nerve: representation in terms of discharge rate. , 1979, The Journal of the Acoustical Society of America.
[64] Hamid Sheikhzadeh,et al. Speech analysis and recognition using interval statistics generated from a composite auditory model , 1998, IEEE Trans. Speech Audio Process..
[65] S. Blumstein,et al. Invariant cues for place of articulation in stop consonants. , 1978, The Journal of the Acoustical Society of America.
[66] B. Hannaford,et al. Approximating time-frequency density functions via optimal combinations of spectrograms , 1994, IEEE Signal Processing Letters.
[67] A. Liberman,et al. Acoustic Loci and Transitional Cues for Consonants , 1954 .
[68] Gunnar Fant,et al. Acoustic Theory Of Speech Production , 1960 .
[69] R. N. Ohde,et al. Effect of relative amplitude of frication on perception of place of articulation. , 1991, The Journal of the Acoustical Society of America.
[70] R Meddis,et al. Modeling the identification of concurrent vowels with different fundamental frequencies. , 1992, The Journal of the Acoustical Society of America.
[71] Kuansan Wang,et al. Self-normalization and noise-robustness in early auditory representations , 1994, IEEE Trans. Speech Audio Process..
[72] William J. Williams,et al. Improved time-frequency representation of multicomponent signals using exponential kernels , 1989, IEEE Trans. Acoust. Speech Signal Process..
[73] Les E. Atlas,et al. Applications of positive time-frequency distributions to speech processing , 1994, IEEE Trans. Speech Audio Process..
[74] Richard F. Lyon,et al. On the importance of time—a temporal representation of sound , 1993 .
[75] Yves Grenier,et al. Time-dependent ARMA modeling of nonstationary signals , 1983 .
[76] E D Young,et al. Auditory nerve representation of vowels in background noise. , 1983, Journal of neurophysiology.
[77] Les Atlas,et al. Truly nonstationary techniques for the analysis and display of voiced speech , 1991, [Proceedings] ICASSP 91: 1991 International Conference on Acoustics, Speech, and Signal Processing.
[78] S. Furui. On the role of spectral transition for speech perception. , 1986, The Journal of the Acoustical Society of America.
[79] L. A. Liporace. Linear estimation of nonstationary signals. , 1975, The Journal of the Acoustical Society of America.
[80] E. Young,et al. Auditory-nerve encoding of pinna-based spectral cues: rate representation of high-frequency stimuli. , 1995, The Journal of the Acoustical Society of America.
[81] Sadaoki Furui,et al. Speaker-independent isolated word recognition using dynamic features of speech spectrum , 1986, IEEE Trans. Acoust. Speech Signal Process..
[82] S. Blumstein,et al. Acoustic invariance in speech production: evidence from measurements of the spectral characteristics of stop consonants. , 1979, The Journal of the Acoustical Society of America.
[83] Q Summerfield,et al. Perception of concurrent vowels: effects of harmonic misalignment and pitch-period asynchrony. , 1991, The Journal of the Acoustical Society of America.
[84] John H. L. Hansen,et al. Speech enhancement based on a new set of auditory constrained parameters , 1994, Proceedings of ICASSP '94. IEEE International Conference on Acoustics, Speech and Signal Processing.
[85] Steven Greenberg,et al. Stochastic perceptual models of speech , 1995, 1995 International Conference on Acoustics, Speech, and Signal Processing.
[86] Robert J. Safranek,et al. Signal compression based on models of human perception , 1993, Proc. IEEE.
[87] Biing-Hwang Juang,et al. Filtering the time sequence of spectral parameters for speaker-independent CDHMM word recognition , 1995, EUROSPEECH.
[88] Douglas L. Jones,et al. A signal-dependent time-frequency representation: optimal kernel design , 1993, IEEE Trans. Signal Process..
[89] Nathalie Virag. Speech enhancement based on masking properties of the auditory system , 1995, 1995 International Conference on Acoustics, Speech, and Signal Processing.
[90] C. Neti,et al. Neuromorphic speech processing for noisy environments , 1994, Proceedings of 1994 IEEE International Conference on Neural Networks (ICNN'94).
[91] M. Riley. Speech Time-Frequency Representations , 1989 .
[92] Leon Cohen,et al. Instantaneous Frequency, Its Standard Deviation And Multicomponent Signals , 1988, Optics & Photonics.
[93] S. Zahorian,et al. Spectral-shape features versus formants as acoustic correlates for vowels. , 1993, The Journal of the Acoustical Society of America.
[94] H. Garudadri,et al. Invariant acoustic cues in stop consonants: A cross‐language study using the Wigner distribution , 1986 .
[95] P H Milenkovic. Voice source model for continuous control of pitch period. , 1993, The Journal of the Acoustical Society of America.
[96] Douglas A. Reynolds,et al. Measuring fine structure in speech: application to speaker identification , 1995, 1995 International Conference on Acoustics, Speech, and Signal Processing.
[97] R. Patterson,et al. Complex Sounds and Auditory Images , 1992 .
[98] L. A. Westerman,et al. Rapid and short-term adaptation in auditory nerve responses , 1984, Hearing Research.
[99] Hynek Hermansky,et al. RASTA processing of speech , 1994, IEEE Trans. Speech Audio Process..
[100] S. Shamma. Speech processing in the auditory system. II: Lateral inhibition and the central processing of speech evoked activity in the auditory nerve. , 1985, The Journal of the Acoustical Society of America.
[101] R I Damper,et al. A computational model of afferent neural activity from the cochlea to the dorsal acoustic stria. , 1991, The Journal of the Acoustical Society of America.
[102] P. Lieberman. Perturbations in Vocal Pitch , 1960 .