Speech Recognition Experiments with Perceptrons

Artificial neural networks (ANNs) are capable of accurate recognition of simple speech vocabularies such as isolated digits [1]. This paper looks at two more difficult vocabularies, the alphabetic E-set and a set of polysyllabic words. The E-set is difficult because it contains weak discriminants and polysyllables are difficult because of timing variation. Polysyllabic word recognition is aided by a time pre-alignment technique based on dynamic programming and E-set recognition is improved by focusing attention. Recognition accuracies are better than 98% for both vocabularies when implemented with a single layer perceptron.