Affiliation:
1. Electronics and Communication Engineering, Birla Institute of Technology Mesra, India
2. Electronics and Communications Engineering, C V Raman Global University, Bhubaneswar, India
Abstract
Automatic speech emotion recognition (SER) is a crucial task in communication-based systems, where feature extraction plays an important role. Recently, many SER models have been developed and implemented successfully for English and other Western languages. However, SER performance for traditional Indian languages is not up to the mark. This paper addresses SER in low-resource Indian languages, mainly Bengali. In the first step, the relevant phase-based information is extracted from the speech signal in the form of phase-based cepstral features (PBCC) using cepstral and statistical analysis. In the proposed SER model, several pre-processing techniques are combined with feature extraction and a gradient boosting machine-based classifier. Finally, simulation results are evaluated and compared on speaker-dependent and speaker-independent tests using multiple language datasets and independent test sets. The proposed PBCC feature-based model performs well compared with standard methods, with an average emotion recognition efficiency of 96%.
Publisher
Association for Computing Machinery (ACM)
Reference
50 articles.
1. Gaurav Aggarwal, Sarada Prasad Gochhayat, and Latika Singh. 2021. Parameterization techniques for automatic speech recognition system. 209–250 pages.
2. Speech emotion recognition: Emotional models, databases, features, preprocessing methods, supporting modalities, and classifiers
3. Bird Voice Classification Based on Combination Feature Extraction and Reduction Dimension with the K-Nearest Neighbor;Andono Pulung Nurtantio;Int. J. Intell. Eng. Syst.;2022
4. Moataz El Ayadi, Mohamed S Kamel, and Fakhri Karray. 2011. Survey on speech emotion recognition: Features, classification schemes, and databases. Pattern recognition 44, 3 (2011), 572–587.
5. Multilingual Speech Corpus in Low-Resource Eastern and Northeastern Indian Languages for Speaker and Language Identification
Cited by
4 articles.
1. BSER: A Learning Framework for Bangla Speech Emotion Recognition;2024 6th International Conference on Electrical Engineering and Information & Communication Technology (ICEEICT);2024-05-02
2. Biomedical semantic text summarizer;BMC Bioinformatics;2024-04-16
3. Exploring Emotion and Emotional Variability as Digital Biomarkers in Frontotemporal Dementia Speech;IEEE Access;2024
4. Improved Speech Emotion Recognition in Bengali Language using Deep Learning;2023 26th International Conference on Computer and Information Technology (ICCIT);2023-12-13