Related papers

Dynamic programming algorithm optimization for spoken word recognition

Abstract: This paper reports on an optimum dynamic programming (DP) based time-normalization algorithm for spoken word recognition. First, a general principle of time-normalization is given using time-warping function. Then, two time-normalized distance definitions, called symmetric and asymmetric forms, are derived from the principle. These two forms are compared with each other through theoretical discussions and experimental studies. The symmetric form algorithm superiority is established. A new technique, called slope constraint, is successfully introduced, in which the warping function slope is restricted so as to improve discrimination between words in different categories. The effective slope constraint characteristic is qualitatively analyzed, and the optimum slope constraint condition is determined through experiments. The optimized algorithm is then extensively subjected to experimental comparison with various DP-algorithms, previously applied to spoken word recognition by different research groups. The experiment shows that the present algorithm gives no more than about two-thirds errors, even compared to the best conventional algorithm.
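
The abstract does not spell out the recurrence, so the following is only a minimal sketch of a symmetric-form DP time-normalized distance as it is commonly stated: diagonal steps weighted 2, horizontal and vertical steps weighted 1, and the accumulated cost divided by the sum of the two sequence lengths. The function name `symmetric_dtw`, the absolute-difference local distance, the scalar inputs, and the optional `window` band are assumptions made for illustration. The slope constraint that the abstract describes is deliberately left out to keep the example short.

```python
import math


def symmetric_dtw(a, b, window=None):
    """Time-normalized distance between two numeric sequences (symmetric weights).

    Illustrative sketch only: no slope constraint is applied, and the local
    distance is the absolute difference between scalar samples.
    """
    n, m = len(a), len(b)
    if n == 0 or m == 0:
        raise ValueError("sequences must be non-empty")

    # The band must be at least |n - m| wide, otherwise the end point
    # (n - 1, m - 1) cannot be reached.
    r = max(n, m) if window is None else max(window, abs(n - m))

    g = [[math.inf] * m for _ in range(n)]
    for i in range(n):
        for j in range(max(0, i - r), min(m, i + r + 1)):
            d = abs(a[i] - b[j])
            if i == 0 and j == 0:
                g[i][j] = 2 * d  # initial condition for the symmetric form
                continue
            best = math.inf
            if j > 0:
                best = min(best, g[i][j - 1] + d)          # step along b only
            if i > 0 and j > 0:
                best = min(best, g[i - 1][j - 1] + 2 * d)  # diagonal step
            if i > 0:
                best = min(best, g[i - 1][j] + d)          # step along a only
            g[i][j] = best

    # With these weights every path's weight sum is n + m,
    # so the normalization does not depend on the path taken.
    return g[n - 1][m - 1] / (n + m)


print(symmetric_dtw([1, 2, 3], [1, 2, 2, 3]))  # prints 0.0
```

Because the weights sum to the same total along every admissible path, the normalization factor is a constant, which is what lets the minimization be done by plain dynamic programming in this sketch.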

Citations
Determination of Variation Ranges of the Psola Transformation Parameters by Using Their Influence on the Acoustic Parameters of Speech (2014)
Towards optimising modality allocation for multimodal output generation in incremental dialogue (2012)
Fall Detection for Mobile Phone based on Movement Pattern (2012)
Analytical DP Matching (2007)
PID Controller Design for Specified Performance (2012)
"Extension du Corps Sonore" - Dancing Viola (NIME, 2009)
Jarvis, Digital Life Assistant (2013)
Deep Learning for Embodied Vision Navigation: A Survey (arXiv:2108.04097, 2021)
Robotic Learning of Manipulation Tasks from Visual Perception Using a Kinect Sensor (2014)
Similarity Measures and Dimensionality Reduction Techniques for Time Series Data Mining (2012)
Wearable Activity Recognition with Crowdsourced Annotation (2016)
Speaker identification from extracted features of selective energized voice signal (2018)
Data-Driven Analysis and Interpolation of Optical Material Properties (2015)
Tslearn, A Machine Learning Toolkit for Time Series Data (J. Mach. Learn. Res., 2020)
Spoken Digits Recognition using Weighted MFCC and Improved Features for Dynamic Time Warping (2012)
Comparative Analysis of Global Feature Extraction Methods for Off-line Signature Recognition (2012)
Review on Retrospective Procedures to Correct Retinal Motion Artefacts in OCT Imaging (Applied Sciences, 2019)
Machine Learning and Knowledge Discovery in Databases (Lecture Notes in Computer Science, 2016)
pyts: A Python Package for Time Series Classification (J. Mach. Learn. Res., 2020)
Personal Rehabilitation Exercise Assistant with Kinect and Dynamic Time Warping (CIKM, 2013)