Abstract
Accurate speech recognition can provide a natural interface for human–computer interaction. The recognition rates of modern speech recognition systems depend strongly on background noise levels, and the choice of acoustic feature extraction method can have a significant impact on system performance. This paper presents a robust speech recognition system based on a front-end motivated by human cochlear processing of audio signals. In the proposed front-end, cochlear behavior is emulated first by the filtering operations of a gammatone filterbank and then by an inner hair cell (IHC) processing stage. In experiments with a continuous-density Hidden Markov Model (HMM) recognizer, the proposed Gammatone Hair Cell (GHC) coefficients give lower recognition performance than a standard Mel-Frequency Cepstral Coefficients (MFCC) baseline for clean speech, but significantly better performance in noisy conditions.
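The two-stage front-end described above (gammatone filtering followed by an inner hair cell stage) can be illustrated with a minimal sketch. This is not the paper's implementation: the abstract does not specify the IHC model, so half-wave rectification with frame averaging is used here as a simplified stand-in, and the function names, channel count, frequency range, and frame sizes are all hypothetical choices made for illustration.

```python
import numpy as np

def erb(fc):
    """Equivalent rectangular bandwidth in Hz (Glasberg and Moore formula)."""
    return 24.7 * (4.37 * fc / 1000.0 + 1.0)

def erb_space(f_low, f_high, n):
    """Return n center frequencies equally spaced on the ERB-rate scale."""
    e_low = 21.4 * np.log10(4.37e-3 * f_low + 1.0)
    e_high = 21.4 * np.log10(4.37e-3 * f_high + 1.0)
    e = np.linspace(e_low, e_high, n)
    return (10.0 ** (e / 21.4) - 1.0) / 4.37e-3

def gammatone_ir(fc, fs, duration=0.025, order=4):
    """Truncated impulse response of a gammatone filter, scaled to unit energy."""
    t = np.arange(int(duration * fs)) / fs
    b = 1.019 * erb(fc)  # bandwidth parameter commonly used for order 4
    g = t ** (order - 1) * np.exp(-2 * np.pi * b * t) * np.cos(2 * np.pi * fc * t)
    return g / np.sqrt(np.sum(g ** 2))

def ghc_like_features(x, fs, n_channels=8, frame_len=0.025, frame_step=0.010):
    """Gammatone filterbank + simplified hair-cell stage (assumed, not the paper's).

    Returns the center frequencies and a (frames x channels) array of
    log-compressed, frame-averaged, half-wave-rectified channel outputs.
    """
    centers = erb_space(100.0, 0.45 * fs, n_channels)  # assumed frequency range
    flen, fstep = int(frame_len * fs), int(frame_step * fs)
    n_frames = 1 + (len(x) - flen) // fstep
    feats = np.empty((n_frames, n_channels))
    for ch, fc in enumerate(centers):
        # Stage 1: cochlear filtering by FIR convolution with the gammatone response.
        y = np.convolve(x, gammatone_ir(fc, fs), mode="full")[: len(x)]
        # Stage 2 (stand-in IHC): half-wave rectification, then frame averaging
        # as a crude low-pass, then log compression.
        ihc = np.maximum(y, 0.0)
        for i in range(n_frames):
            seg = ihc[i * fstep : i * fstep + flen]
            feats[i, ch] = np.log(np.mean(seg) + 1e-10)
    return centers, feats
```

For a pure 1 kHz tone, the channel whose center frequency lies closest to 1 kHz should carry the largest output, which is a quick sanity check on the filterbank. A real front-end would typically add further steps (for example decorrelation of the channel outputs) before passing features to an HMM recognizer.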
Funder
Hrvatska Zaklada za Znanost
Subject
Computer Networks and Communications, Human-Computer Interaction
Cited by
11 articles.