Emotion recognition from speech signals using digital features optimization by diversity measure fusion

Author:

Konduru Ashok Kumar (1), Mazher Iqbal J.L. (2)

Affiliation:

1. Veltech Rangarajan Dr. Sagunthala R&D Institute of Science and Technology, Avadi, Chennai, India

2. ECE, Veltech Rangarajan Dr. Sagunthala R&D Institute of Science and Technology, Avadi, Chennai, India

Abstract

Emotion recognition from speech signals plays a crucial role in human-computer interaction and behavioral studies. The task is challenging, however, because speech data are high-dimensional and noisy. This article presents a comprehensive study and analysis of a novel approach, “Digital Features Optimization by Diversity Measure Fusion (DFOFDM)”, aimed at addressing these challenges. The paper begins by explaining the need for improved emotion recognition methods, followed by a detailed introduction to DFOFDM. The approach uses acoustic and spectral features extracted from speech signals, together with an optimized feature selection process based on a fusion of diversity measures. The study’s central method is a Cuckoo Search-based classification strategy tailored to this multi-label problem. The performance of the proposed DFOFDM approach is evaluated extensively. The emotion labels ‘Angry’, ‘Happy’, and ‘Neutral’ achieved precision above 92%, while the other emotions fell within the range of 87% to 90%. Recall was similar, with most emotions falling within the 90% to 95% range, and the F-Score reflected comparable statistics for each label. Notably, the DFOFDM model showed resilience to label imbalance and to noise in speech data, both of which matter for real-world applications. Compared with a contemporary model, “Transfer Subspace Learning by Least Square Loss (TSLSL)”, DFOFDM produced superior results across the evaluation metrics, indicating a promising improvement in speech emotion recognition. In terms of computational complexity, DFOFDM scaled effectively, making it a feasible solution for large-scale applications. The study nevertheless acknowledges potential limitations of DFOFDM that might affect its performance on certain types of real-world data. The findings underline the potential of DFOFDM for advancing emotion recognition techniques and indicate the need for further research.
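
The abstract reports precision, recall, and F-Score separately for each emotion label. As a minimal sketch of how such per-label figures are commonly computed (not the authors' evaluation code), the Python below assumes each utterance carries exactly one true and one predicted label. The function name `per_label_scores` and the toy data are hypothetical and are not taken from the paper.

```python
from collections import Counter

def per_label_scores(y_true, y_pred):
    """Return {label: (precision, recall, f_score)} for single-label predictions."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, guess in zip(y_true, y_pred):
        if truth == guess:
            tp[truth] += 1
        else:
            fp[guess] += 1  # the predicted label was wrong
            fn[truth] += 1  # the true label was missed
    scores = {}
    for label in sorted(set(y_true) | set(y_pred)):
        p_den = tp[label] + fp[label]
        r_den = tp[label] + fn[label]
        precision = tp[label] / p_den if p_den else 0.0
        recall = tp[label] / r_den if r_den else 0.0
        pr_sum = precision + recall
        # F-Score here is the harmonic mean of precision and recall (F1).
        f_score = 2 * precision * recall / pr_sum if pr_sum else 0.0
        scores[label] = (precision, recall, f_score)
    return scores

# Toy data invented for illustration only; these are not results from the paper.
y_true = ["Angry", "Angry", "Happy", "Happy", "Neutral", "Neutral"]
y_pred = ["Angry", "Happy", "Happy", "Happy", "Neutral", "Angry"]
scores = per_label_scores(y_true, y_pred)

for label, (p, r, f) in scores.items():
    print(f"{label}: precision={p:.2f} recall={r:.2f} f_score={f:.2f}")
```

Reporting the three metrics per label, rather than a single overall accuracy, makes it visible when a model does well on frequent emotions but poorly on rare ones, which is relevant to the label-imbalance claim in the abstract.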

Publisher

IOS Press

Subject

Artificial Intelligence, General Engineering, Statistics and Probability

