Author:
Garcia-Cuesta Esteban, Salvador Antonio Barba, Páez Diego Gachet
Abstract
In this paper we present a new speech emotion dataset in Spanish. The database was created using an elicited approach and is composed of recordings of fifty non-actors expressing Ekman's six basic emotions (anger, disgust, fear, happiness, sadness, and surprise) plus a neutral tone. This article describes how the database was created, from the recording step to the crowdsourced perception test. The crowdsourcing made it possible to statistically validate the emotion of each collected audio sample and to filter out noisy samples. We thus obtained two datasets, EmoSpanishDB and EmoMatchSpanishDB. The first includes those recorded audios that reached consensus during the crowdsourcing process. The second selects from EmoSpanishDB only those audios whose perceived emotion also matches the originally elicited one. Finally, we present a baseline comparative study of different state-of-the-art machine learning techniques in terms of accuracy, precision, and recall for both datasets. The results obtained for EmoMatchSpanishDB improve on those obtained for EmoSpanishDB; we therefore recommend following the methodology used here for the creation of emotional databases.
Funder
Universidad Europea de Madrid
Universidad Politécnica de Madrid
Publisher
Springer Science and Business Media LLC
Subject
Computer Networks and Communications,Hardware and Architecture,Media Technology,Software
Cited by
1 article.