Affiliation:
1. Gauhati University, India
Abstract
Acoustic modeling of the sound unit is a crucial component of an Automatic Speech Recognition (ASR) system. It is the process of establishing statistical representations of the feature vector sequences of a particular sound unit, so that a classifier for the entire sound unit can be designed for use in the ASR system. Current ASR systems use Hidden Markov Models (HMMs) to deal with temporal variability and Gaussian Mixture Models (GMMs) for acoustic modeling. Recently, machine learning paradigms have been explored for application in the speech recognition domain. In this regard, the Multilayer Perceptron (MLP), the Recurrent Neural Network (RNN), and similar architectures are extensively used. Artificial Neural Networks (ANNs) are trained by backpropagating error derivatives and therefore have the potential to learn much better models of nonlinear data. More recently, Deep Neural Networks (DNNs) with many hidden layers have gained favor among researchers and are accepted as suitable for speech signal modeling. This chapter describes various techniques and works on ANN-based acoustic modeling.