Experimental Evaluation of Deep Learning Methods for an Intelligent Pathological Voice Detection System Using the Saarbruecken Voice Database

Published:2021-08-02 Issue:15 Volume:11 Page:7149
ISSN:2076-3417
Container-title:Applied Sciences
language:en
Short-container-title:Applied Sciences

Author:

Lee Ji-Yeoun

Abstract

This work evaluates deep learning methods, namely feedforward neural networks (FNNs) and convolutional neural networks (CNNs), for pathological voice detection using mel-frequency cepstral coefficients (MFCCs), linear prediction cepstrum coefficients (LPCCs), and higher-order statistics (HOS) parameters. In total, 518 voice data samples were obtained from the publicly available Saarbruecken voice database (SVD), comprising recordings from 259 healthy and 259 pathological speakers (women and men) producing the vowels /a/, /i/, and /u/ at normal pitch. Significant differences were observed between the normal and the pathological voice signals for normalized skewness (p = 0.000) and normalized kurtosis (p = 0.000), except for normalized kurtosis estimated in the /u/ samples in women (p = 0.051). These parameters are therefore useful and meaningful for classifying pathological voice signals. The highest accuracy, 82.69%, was achieved by the CNN classifier with LPCC parameters for the /u/ vowel in men. The second-best performance, 80.77%, was obtained by the FNN classifier with a combination of MFCCs and HOS parameters for the /i/ vowel samples in women. Combining the acoustic measures with HOS parameters was beneficial for characterization in terms of accuracy. The combination of various parameters and deep learning methods was also useful for distinguishing normal from pathological voices.
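
As an illustration of the higher-order statistics named in the abstract, the following is a minimal Python sketch of frame-wise normalized skewness and kurtosis. It is not the author's implementation: the function names, the frame length, the hop size, and the toy signal are all assumptions made for this example, and kurtosis is taken here as the fourth central moment divided by the squared variance (without subtracting 3), which may differ from the definition used in the paper.

```python
import numpy as np


def normalized_skewness(x):
    """Third central moment divided by variance**1.5 (assumed definition)."""
    x = np.asarray(x, dtype=float)
    centered = x - x.mean()
    variance = np.mean(centered ** 2)
    return np.mean(centered ** 3) / variance ** 1.5


def normalized_kurtosis(x):
    """Fourth central moment divided by variance**2 (assumed definition, no -3)."""
    x = np.asarray(x, dtype=float)
    centered = x - x.mean()
    variance = np.mean(centered ** 2)
    return np.mean(centered ** 4) / variance ** 2


def frame_hos(signal, frame_len=1024, hop=512):
    """Return one (skewness, kurtosis) pair per frame.

    frame_len and hop are hypothetical values chosen for illustration only.
    """
    signal = np.asarray(signal, dtype=float)
    feats = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        frame = signal[start:start + frame_len]
        feats.append((normalized_skewness(frame), normalized_kurtosis(frame)))
    return np.array(feats)


# Toy stand-in for a sustained vowel: a 150 Hz sine plus a little noise,
# one second long at an assumed 16 kHz sampling rate.
rng = np.random.default_rng(0)
t = np.arange(16000) / 16000.0
toy = np.sin(2 * np.pi * 150 * t) + 0.05 * rng.standard_normal(t.size)
hos = frame_hos(toy)
```

In a pipeline of the kind the abstract describes, per-frame values like these could be concatenated with MFCC or LPCC vectors before being passed to a classifier; how the paper actually combines them is not stated in this record.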

Funder

National Research Foundation of Korea

Publisher

MDPI AG

Subject

Fluid Flow and Transfer Processes, Computer Science Applications, Process Chemistry and Technology, General Engineering, Instrumentation, General Materials Science

Link

https://www.mdpi.com/2076-3417/11/15/7149/pdf

References: 37 articles.

1. Pathological Voice Detection Using Efficient Combination of Heterogeneous Features

2. Objective Pathological Voice Quality Assessment Based on HOS Features

3. Automatic Assessment of Pathological Voice Quality Using Higher-Order Statistics in the LPC Residual Domain

4. Discrimination Between Pathological and Normal Voices Using GMM-SVM Approach

5. Towards Secured Online Monitoring for Digitalized GIS Against Cyber-Attacks Based on IoT and Machine Learning

Cited by 24 articles.

1. Voice pathology detection on spontaneous speech data using deep learning models;International Journal of Speech Technology;2024-08-10

2. Pathological voice classification using MEEL features and SVM-TabNet model;Speech Communication;2024-07

3. Identification of Smith–Magenis syndrome cases through an experimental evaluation of machine learning methods;Frontiers in Computational Neuroscience;2024-03-22

4. Speech Disorders Analysis Using a Line of Narrow-Band Filters;2024 6th International Youth Conference on Radio Electronics, Electrical and Power Engineering (REEPE);2024-02-29

5. Time-Frequency Scattergrams for Biomedical Audio Signal Representation and Classification;IEEE/ACM Transactions on Audio, Speech, and Language Processing;2024