A speech enhancement algorithm based on a non-negative hidden Markov model and Kullback-Leibler divergence

Published:2022-09-08 Issue:1 Volume:2022
ISSN:1687-4722
Container-title:EURASIP Journal on Audio, Speech, and Music Processing
language:en
Short-container-title:J AUDIO SPEECH MUSIC PROC.

Author:

Xiang Yang, Shi Liming, Højvang Jesper Lisby, Rasmussen Morten Højfeldt, Christensen Mads Græsbøll

Abstract

In this paper, we propose a supervised single-channel speech enhancement method that combines Kullback-Leibler (KL) divergence-based non-negative matrix factorization (NMF) and a hidden Markov model (NMF-HMM). Integrating the HMM allows the temporal dynamics of speech signals to be taken into account. The method consists of a training stage and an enhancement stage. In the training stage, the sum of the Poisson distribution, leading to the KL divergence measure, is used as the observation model for each state of the HMM. This ensures that a computationally efficient multiplicative update can be used to update the model parameters. In the online enhancement stage, a novel minimum mean square error estimator is proposed for the NMF-HMM. This estimator can be implemented using parallel computing, reducing the time complexity. Moreover, the experimental results show that, compared to traditional NMF-based speech enhancement methods, the proposed algorithm improved the short-time objective intelligibility and the perceptual evaluation of speech quality by 5% and 0.18, respectively.
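To illustrate the kind of multiplicative update the abstract refers to, the following is a minimal sketch of plain KL-divergence NMF using the standard Lee-Seung update rules. It is not the paper's NMF-HMM: there is no hidden Markov model, no per-state dictionary, and no MMSE estimator here. The function names `kl_divergence` and `kl_nmf`, the parameters `rank`, `n_iter`, `seed`, and `eps`, and the random test matrix are all assumptions made for illustration.

```python
import numpy as np


def kl_divergence(V, W, H, eps=1e-12):
    """Generalized KL divergence D(V || WH), summed over all entries."""
    WH = W @ H + eps
    return float(np.sum(V * np.log((V + eps) / WH) - V + WH))


def kl_nmf(V, rank, n_iter=200, seed=0, eps=1e-12):
    """Factor a non-negative matrix V (F x T) as W (F x rank) @ H (rank x T).

    Uses the standard Lee-Seung multiplicative updates for the KL
    divergence (an illustrative stand-in, not the paper's NMF-HMM).
    """
    rng = np.random.default_rng(seed)
    F, T = V.shape
    # Random non-negative initialization (hypothetical choice).
    W = rng.random((F, rank)) + eps
    H = rng.random((rank, T)) + eps
    history = [kl_divergence(V, W, H, eps)]  # divergence before any update

    for _ in range(n_iter):
        # Update activations H: ratio V / (WH) back-projected through W.
        WH = W @ H + eps
        H *= (W.T @ (V / WH)) / (W.sum(axis=0)[:, None] + eps)
        # Update basis W with the refreshed reconstruction.
        WH = W @ H + eps
        W *= ((V / WH) @ H.T) / (H.sum(axis=1)[None, :] + eps)
        history.append(kl_divergence(V, W, H, eps))

    return W, H, history


if __name__ == "__main__":
    # Synthetic non-negative "spectrogram" with exact rank 3.
    rng = np.random.default_rng(1)
    V = rng.random((16, 3)) @ rng.random((3, 40))
    W, H, history = kl_nmf(V, rank=3, n_iter=100)
    print("initial divergence:", history[0])
    print("final divergence:  ", history[-1])
```

Because each update only multiplies the current factors by non-negative ratios, `W` and `H` stay non-negative without any projection step, which is one reason such updates are considered computationally convenient.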

Funder

Innovationsfonden

Publisher

Springer Science and Business Media LLC

Subject

Electrical and Electronic Engineering,Acoustics and Ultrasonics

Link

https://link.springer.com/content/pdf/10.1186/s13636-022-00256-5.pdf

Reference: 65 articles.

1. J. Li, L. Deng, Y. Gong, R. Haeb-Umbach, An overview of noise-robust automatic speech recognition. IEEE/ACM Trans. Audio Speech Lang. Process. 22(4), 745–777 (2014)

2. P.C. Loizou, Speech Enhancement: Theory and Practice (CRC Press, Boca Raton, 2013)

3. Y. Xu, J. Du, L.-R. Dai, C.-H. Lee, An experimental study on speech enhancement based on deep neural networks. IEEE Signal Process. Lett. 21(1), 65–68 (2013)

4. I. Cohen, S. Gannot, Spectral enhancement methods, in Springer Handbook of Speech Processing (Springer, Berlin, Heidelberg, 2008), pp. 873–902

5. S. Boll, Suppression of acoustic noise in speech using spectral subtraction. IEEE Trans. Acoust. Speech Signal Process. 27(2), 113–120 (1979)

Cited by 1 article.

1. Unsupervised framework for single channel heart and lung sounds separation in data constrained environments;Applied Acoustics;2024-07