Abstract
This paper proposes a technique for improving statistical-model-based voice activity detection (VAD) in noisy environments to be applied in an auditory hearing aid. The proposed method is implemented for a uniform polyphase discrete Fourier transform filter bank satisfying an auditory device time latency of 8 ms. The proposed VAD technique provides an online unified framework to overcome the frequent false rejection of the statistical-model-based likelihood-ratio test (LRT) in noisy environments. The method is based on the observation that the sparseness of speech and background noise cause high false-rejection error rates in statistical LRT-based VAD—the false rejection rate increases as the sparseness increases. We demonstrate that the false-rejection error rate can be reduced by incorporating likelihood-ratio order statistics into a conventional LRT VAD. We confirm experimentally that the proposed method relatively reduces the average detection error rate by 15.8% compared to a conventional VAD with only minimal change in the false acceptance probability for three different noise conditions whose signal-to-noise ratio ranges from 0 to 20 dB.
Funder
National Research Foundation of Korea
Subject
Fluid Flow and Transfer Processes,Computer Science Applications,Process Chemistry and Technology,General Engineering,Instrumentation,General Materials Science
Reference29 articles.
1. ITU-T Recommendation G.729 Annex B: a silence compression scheme for use with G.729 optimized for V.70 digital simultaneous voice and data applications
2. Speech Enhancement;Benesty,2005
3. Speech Processing, Transmission and Quality Aspects (STQ); Distributed Speech Recognition; Advanced Front–End Feature Extraction Algorithm; Compression Algorithms,2002
4. The master hearing aid: A PC based platform for algorithm development and evaluation;Grimm;Acta Acust. United Acust.,2006
5. MarkeTrak V
Cited by
3 articles.
订阅此论文施引文献
订阅此论文施引文献,注册后可以免费订阅5篇论文的施引文献,订阅后可以查看论文全部施引文献