Speaker-Independent Spectral Enhancement for Bone-Conducted Speech
Published: 2023-03-09
Issue: 3
Volume: 16
Page: 153
ISSN: 1999-4893
Container-title: Algorithms
Language: en
Short-container-title: Algorithms
Author:
Cheng Liangliang 1, Dou Yunfeng 2, Zhou Jian 1, Wang Huabin 1, Tao Liang 1
Affiliation:
1. School of Computer Science and Technology, Anhui University, Hefei 230601, China
2. Anhui Finance & Trade Vocational College, Hefei 230601, China
Abstract
Owing to its acoustic characteristics, bone-conducted (BC) speech can be enhanced to enable better communication in complex, high-noise environments. However, existing BC speech enhancement models recover the high-frequency spectrum of BC speech poorly and show weak enhancement performance and robustness on speaker-independent BC speech datasets. To improve speaker-independent BC speech enhancement, we use a generative adversarial network (GAN)-based method to establish a feature mapping between BC and air-conducted (AC) speech and thereby recover the missing components of BC speech. In addition, the method adds a spectral distance constraint to model training and, finally, uses the trained enhancement model to reconstruct the BC speech. The experimental results show that this method outperforms comparison methods such as CycleGAN, BLSTM, GMM, and StarGAN for speaker-independent BC speech enhancement and obtains higher subjective and objective evaluation scores for the enhanced BC speech.
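The abstract describes a GAN-based mapping from BC to AC spectral features with an additional spectral distance constraint on training. The sketch below is not the authors' model; it is a minimal illustration of that general idea, assuming paired BC/AC log-spectral frames, small fully connected networks, and an L1 spectral distance weighted by a hypothetical constant LAMBDA_SPEC.

```python
# Minimal sketch (illustrative assumptions, not the paper's architecture):
# a generator maps BC spectral frames to estimated AC frames, a discriminator
# supplies the adversarial signal, and an L1 spectral-distance term between the
# generated and reference AC spectra constrains the generator update.
import torch
import torch.nn as nn

N_FREQ = 257          # assumed spectral frame size (e.g., 512-point STFT magnitude)
LAMBDA_SPEC = 10.0    # assumed weight of the spectral distance constraint

class Generator(nn.Module):
    """Maps a BC log-magnitude spectral frame to an estimated AC frame."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(N_FREQ, 512), nn.ReLU(),
            nn.Linear(512, 512), nn.ReLU(),
            nn.Linear(512, N_FREQ),
        )
    def forward(self, x):
        return self.net(x)

class Discriminator(nn.Module):
    """Scores whether a spectral frame looks like real AC speech."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(N_FREQ, 512), nn.LeakyReLU(0.2),
            nn.Linear(512, 256), nn.LeakyReLU(0.2),
            nn.Linear(256, 1),
        )
    def forward(self, x):
        return self.net(x)

def train_step(gen, disc, opt_g, opt_d, bc, ac,
               bce=nn.BCEWithLogitsLoss(), l1=nn.L1Loss()):
    """One adversarial update on a batch of paired BC/AC spectral frames."""
    # Discriminator: real AC frames vs. generated (fake) frames.
    fake = gen(bc).detach()
    d_loss = bce(disc(ac), torch.ones(ac.size(0), 1)) + \
             bce(disc(fake), torch.zeros(ac.size(0), 1))
    opt_d.zero_grad(); d_loss.backward(); opt_d.step()

    # Generator: fool the discriminator + spectral distance constraint.
    fake = gen(bc)
    g_loss = bce(disc(fake), torch.ones(ac.size(0), 1)) + LAMBDA_SPEC * l1(fake, ac)
    opt_g.zero_grad(); g_loss.backward(); opt_g.step()
    return d_loss.item(), g_loss.item()

if __name__ == "__main__":
    gen, disc = Generator(), Discriminator()
    opt_g = torch.optim.Adam(gen.parameters(), lr=2e-4)
    opt_d = torch.optim.Adam(disc.parameters(), lr=2e-4)
    bc = torch.randn(8, N_FREQ)   # placeholder BC spectra (real data: paired frames)
    ac = torch.randn(8, N_FREQ)   # placeholder AC spectra
    print(train_step(gen, disc, opt_g, opt_d, bc, ac))
```

At inference time, only the trained generator would be applied frame by frame to BC spectra before waveform reconstruction; the discriminator and the spectral distance term are used during training only.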
Funder
National Natural Science Foundation of China Joint Fund Key Project
National Natural Science Foundation of China
Natural Science Foundation of Anhui Province
Key Projects of Natural Science Foundation of Anhui Province Universities
Subject
Computational Mathematics, Computational Theory and Mathematics, Numerical Analysis, Theoretical Computer Science
Cited by
2 articles.