Abstract
Despite excellent performance in quiet, cochlear implants (CIs) only partially restore normal levels of intelligibility in noisy settings. Recent developments in machine learning have produced deep neural network (DNN) models that achieve noteworthy performance in speech enhancement and separation tasks. However, no commercially available CI audio processor currently uses DNN models for noise reduction. We implemented two DNN models intended for applications in CIs: (1) a recurrent neural network (RNN), a lightweight template model, and (2) SepFormer, the current top-performing speech separation model in the literature. The models were trained with a custom training dataset (30 hours) that included four configurations: speech in non-speech noise and speech in 1-talker, 2-talker, and 4-talker speech babble backgrounds. The enhancement of the target speech (or the suppression of the noise) by the models was evaluated with commonly used acoustic metrics of quality and intelligibility: (1) signal-to-distortion ratio, (2) perceptual evaluation of speech quality, and (3) short-time objective intelligibility. Both DNN models yielded significant improvements in all acoustic metrics tested. The two DNN models were also evaluated with thirteen CI users using two types of background noise: (1) CCITT noise (speech-shaped stationary noise) and (2) 2-talker babble. Significant improvements in speech intelligibility were observed when the noisy speech was processed by the models, compared to the unprocessed conditions. This work serves as a proof of concept for the application of DNN technology in CIs for an improved listening experience and speech comprehension in noisy environments.
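To make the first of the evaluation metrics concrete, here is a minimal sketch of the signal-to-distortion ratio in its simplest form: the ratio (in dB) of target-signal energy to residual-error energy. This is an illustrative definition only; the paper's evaluation pipeline and any scaling-invariant SDR variant it may use are not reproduced here, and the test signals below are hypothetical.

```python
import math
import random

def sdr(reference, estimate):
    """Signal-to-distortion ratio in dB: 10*log10 of the ratio of
    reference-signal energy to the energy of the estimation error.
    Higher values mean the enhanced signal is closer to the target."""
    signal_power = sum(s * s for s in reference)
    error_power = sum((s - e) ** 2 for s, e in zip(reference, estimate))
    return 10.0 * math.log10(signal_power / error_power)

# Hypothetical example: a clean 440 Hz tone at 16 kHz as the target,
# and a "noisy mixture" made by adding Gaussian noise to it.
fs = 16000
clean = [math.sin(2 * math.pi * 440 * n / fs) for n in range(fs)]
random.seed(0)
noisy = [s + random.gauss(0.0, 0.1) for s in clean]

# An "enhanced" signal that removes half of the noise should score
# a higher SDR than the raw mixture (roughly +6 dB).
enhanced = [s + 0.5 * (e - s) for s, e in zip(clean, noisy)]
```

A denoising model's improvement is then reported as the SDR of its output minus the SDR of the unprocessed mixture, both computed against the same clean reference.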
Publisher
Cold Spring Harbor Laboratory
Cited by
1 article.