Author
Sidi Yakoub Mohammed, Selouani Sid-ahmed, Zaidi Brahim-Fares, Bouchair Asma
Abstract
In this paper, we combine empirical mode decomposition with Hurst-based mode selection (EMDH) and a deep learning architecture based on a convolutional neural network (CNN) to improve the recognition of dysarthric speech. The EMDH speech enhancement technique is used as a preprocessing step to improve the quality of dysarthric speech. Mel-frequency cepstral coefficients are then extracted from the EMDH-processed speech and used as input features to a CNN-based recognizer. The effectiveness of the proposed EMDH-CNN approach is demonstrated by results obtained on the Nemours corpus of dysarthric speech. Compared to baseline systems that use hidden Markov models with Gaussian mixture models (HMM-GMMs) and a CNN without an enhancement module, the EMDH-CNN system increases the overall accuracy by 20.72% and 9.95%, respectively, in a k-fold cross-validation experimental setup.
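As a rough illustration of the Hurst-based mode selection idea named in the abstract, the following sketch estimates a Hurst exponent for each decomposed mode and keeps the modes that fall on one side of a threshold. The abstract does not say how the exponent is estimated or which selection rule is applied, so everything here is an assumption: the rescaled-range (R/S) estimator is a swapped-in, commonly used technique, the names `hurst_rs` and `select_modes` are hypothetical, and the "keep modes at or below the threshold" rule is chosen only for illustration. The decomposition step itself (EMD) is not shown; the modes are assumed to be given as lists of samples with at least 32 values each.

```python
import math


def hurst_rs(x, min_chunk=8):
    """Estimate a Hurst exponent for sequence x by rescaled-range (R/S) analysis.

    Assumes len(x) >= 4 * min_chunk so that at least two chunk sizes are used.
    """
    n = len(x)
    log_sizes, log_rs = [], []
    size = min_chunk
    while size <= n // 2:
        rs_values = []
        for start in range(0, n - size + 1, size):
            chunk = x[start:start + size]
            mean = sum(chunk) / size
            # Track the range of the cumulative deviation from the chunk mean.
            cum, lo, hi, sq = 0.0, 0.0, 0.0, 0.0
            for v in chunk:
                d = v - mean
                cum += d
                lo, hi = min(lo, cum), max(hi, cum)
                sq += d * d
            std = math.sqrt(sq / size)
            if std > 0:
                rs_values.append((hi - lo) / std)
        if rs_values:
            log_sizes.append(math.log(size))
            log_rs.append(math.log(sum(rs_values) / len(rs_values)))
        size *= 2
    # Least-squares slope of log(R/S) against log(chunk size).
    mx = sum(log_sizes) / len(log_sizes)
    my = sum(log_rs) / len(log_rs)
    num = sum((a - mx) * (b - my) for a, b in zip(log_sizes, log_rs))
    den = sum((a - mx) ** 2 for a in log_sizes)
    return num / den


def select_modes(modes, threshold):
    """Return indices of modes whose estimated Hurst exponent is <= threshold.

    Hypothetical selection rule for illustration; the paper's criterion may differ.
    """
    return [i for i, m in enumerate(modes) if hurst_rs(m) <= threshold]
```

In this sketch an enhanced signal would be rebuilt by summing the selected modes sample by sample; the threshold value and the direction of the comparison would need to be taken from the paper itself.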
Publisher
Springer Science and Business Media LLC
Subject
Electrical and Electronic Engineering, Acoustics and Ultrasonics
Cited by
33 articles.