Nonlinear Dynamic Feature Extraction Based on Phase Space Reconstruction for the Classification of Speech and Emotion

Published: 2020-04-09 Volume: 2020 Page: 1-15
ISSN: 1024-123X
Container-title: Mathematical Problems in Engineering
Language: en

Author:

Sun Ying¹, Zhang Xue-Ying¹, Ma Jiang-He¹, Song Chun-Xiao¹, Lv Hui-Fen¹

Affiliation:

1. College of Information Engineering, Taiyuan University of Technology, Shanxi Province Jinzhong City Yuci District College Town, Taiyuan 030600, China

Abstract

Linear feature parameters have known shortcomings for speech signals, and existing time- and frequency-domain features are limited in how completely they characterize speech information. To address this, we propose a nonlinear feature extraction method based on phase space reconstruction (PSR) theory. First, the speech signal was analyzed using a nonlinear dynamic model. Next, the model was used to reconstruct the one-dimensional speech time series. Finally, nonlinear dynamic (NLD) features based on the reconstructed phase space were extracted as new characteristic parameters. The performance of the NLD features was verified by comparing their recognition rates with those of prosodic features and MFCC features. The Korean isolated words database, the Berlin emotional speech database, and the CASIA emotional speech database were chosen for validation, and the effectiveness of the NLD features was tested using a Support Vector Machine classifier. The results show that NLD features not only achieve a high recognition rate and excellent noise robustness in speech recognition tasks but also can fully characterize the different emotions contained in speech signals.
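The abstract does not give the reconstruction procedure itself. As a rough illustration only, phase space reconstruction is commonly done by time-delay embedding, where a one-dimensional series x is mapped to vectors [x(i), x(i + tau), ..., x(i + (m - 1) tau)]. The sketch below assumes that approach. The function name delay_embed, the embedding dimension, the delay, and the synthetic sine frame are all hypothetical choices made for this example and are not taken from the paper.

```python
import numpy as np


def delay_embed(signal, dim, tau):
    """Time-delay embedding of a 1-D signal (illustrative sketch).

    Row i of the result is [x[i], x[i + tau], ..., x[i + (dim - 1) * tau]].
    """
    x = np.asarray(signal, dtype=float)
    n_points = x.size - (dim - 1) * tau
    if n_points <= 0:
        raise ValueError("signal too short for this dim and tau")
    # Stack dim delayed copies of the signal as columns.
    return np.column_stack(
        [x[k * tau : k * tau + n_points] for k in range(dim)]
    )


# Toy stand-in for one voiced speech frame: a 100 Hz sine sampled at 8 kHz.
fs = 8000
t = np.arange(400) / fs
frame = np.sin(2 * np.pi * 100 * t)

# dim and tau are assumed values for illustration, not the paper's settings.
trajectory = delay_embed(frame, dim=3, tau=20)
print(trajectory.shape)  # (360, 3)
```

Each row of trajectory is one point in the reconstructed phase space. Nonlinear descriptors of the kind the abstract calls NLD features would then be computed from this point cloud, but which descriptors the authors use is not stated here.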

Funder

National Natural Science Foundation of China

Publisher

Hindawi Limited

Subject

General Engineering, General Mathematics

Link

http://downloads.hindawi.com/journals/mpe/2020/9452976.pdf
