SUST Bangla Emotional Speech Corpus (SUBESCO): An audio-only emotional speech corpus for Bangla

Published: 2021-04-30
Issue: 4
Volume: 16
Page: e0250173
ISSN: 1932-6203
Container-title: PLOS ONE
Language: en
Short-container-title: PLoS ONE

Author:

Sultana Sadia, Rahman M. Shahidur, Selim M. Reza, Iqbal M. Zafar

Abstract

SUBESCO is an audio-only emotional speech corpus for the Bangla language. The corpus contains 7000 utterances with a total duration of more than 7 hours, making it the largest emotional speech corpus available for this language. Twenty native speakers participated in the gender-balanced set, each recording 10 sentences simulating seven target emotions. Fifty university students participated in the evaluation of the corpus. Each audio clip, except those of the Disgust emotion, was validated four times by male and female raters. Raw hit rates and unbiased rates were calculated, yielding scores above the chance level of responses. The overall recognition rate in human perception tests was reported to be above 70%. Kappa statistics and intra-class correlation coefficient scores indicated a high level of inter-rater reliability and consistency in the corpus evaluation. SUBESCO is an Open Access database, licensed under Creative Commons Attribution 4.0 International, and can be downloaded free of charge from: https://doi.org/10.5281/zenodo.4526477.
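The abstract mentions raw hit rates and unbiased rates without giving their formulas. As a rough illustration only, the sketch below assumes the raw hit rate is the fraction of clips of an intended emotion that raters labelled correctly, and that the unbiased rate follows Wagner's unbiased hit rate (correct responses squared, divided by the product of the row and column totals of the confusion matrix). The function name `hit_rates`, the two-emotion label set, and the toy counts are hypothetical and are not taken from SUBESCO.

```python
from typing import Dict, List


def hit_rates(confusion: List[List[int]], labels: List[str]) -> Dict[str, Dict[str, float]]:
    """Compute raw and unbiased hit rates per emotion label.

    confusion[i][j] is the number of clips with intended emotion i
    that raters judged as emotion j (hypothetical layout).
    """
    n = len(labels)
    # Row total: how many clips of each intended emotion were presented.
    row_totals = [sum(confusion[i]) for i in range(n)]
    # Column total: how often each emotion was chosen as a response.
    col_totals = [sum(confusion[i][j] for i in range(n)) for j in range(n)]

    rates: Dict[str, Dict[str, float]] = {}
    for k, label in enumerate(labels):
        correct = confusion[k][k]
        raw = correct / row_totals[k] if row_totals[k] else 0.0
        # Assumed formula (Wagner's unbiased hit rate): correct^2 / (row * column).
        denom = row_totals[k] * col_totals[k]
        unbiased = (correct * correct) / denom if denom else 0.0
        rates[label] = {"raw": raw, "unbiased": unbiased}
    return rates


# Toy counts, invented for illustration; not SUBESCO results.
labels = ["happy", "sad"]
confusion = [
    [8, 2],  # intended "happy": 8 judged happy, 2 judged sad
    [4, 6],  # intended "sad": 4 judged happy, 6 judged sad
]
rates = hit_rates(confusion, labels)
```

Under this assumption, the unbiased rate penalises a label that raters overuse: in the toy data "happy" is chosen 12 times for 10 intended clips, so its unbiased rate falls below its raw hit rate.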

Funder

Higher Education Quality Enhancement Project for the Development of MultiPlatform Speech and Language Processing Software for Bangla

Shahjalal University of Science and Technology (SUST) Research Center

Publisher

Public Library of Science (PLoS)

Subject

Multidisciplinary

References (55 articles)

1. Emotion inferences from vocal expression correlate across languages and cultures; KR Scherer; Journal of Cross-Cultural Psychology, 2001

2. Comparison of emotion perception among different cultures; J Dang; Acoustical Science and Technology, 2010

3. Emotional speech: Towards a new generation of databases; E Douglas-Cowie; Speech Communication, 2003

4. Neumann M, et al. Cross-lingual and multilingual speech emotion recognition on english and french. In: 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). IEEE; 2018. p. 5769–5773.

5. Parry J, Palaz D, Clarke G, Lecomte P, Mead R, Berger M, et al. Analysis of Deep Learning Architectures for Cross-Corpus Speech Emotion Recognition. In: INTERSPEECH; 2019. p. 1656–1660.

Cited by 35 articles.

1. Smart reception: An artificial intelligence driven bangla language based receptionist system employing speech, speaker, and face recognition for automating reception services; Engineering Applications of Artificial Intelligence; 2024-10

2. Natural Language Processing for Recognizing Bangla Speech with Regular and Regional Dialects: A Survey of Algorithms and Approaches; 2024 IEEE 48th Annual Computers, Software, and Applications Conference (COMPSAC); 2024-07-02

3. Assessing the effectiveness of ensembles in Speech Emotion Recognition: Performance analysis under challenging scenarios; Expert Systems with Applications; 2024-06

4. EmoBone: A Multinational Audio Dataset of Emotional Bone Conducted Speech; IEEJ Transactions on Electrical and Electronic Engineering; 2024-05-29

5. Emotion recognition for human–computer interaction using high-level descriptors; Scientific Reports; 2024-05-27