Abstract
In this work, we present an approach to understanding the computational methods and decision-making involved in identifying emotions in spontaneous speech. The selected task is based on Spanish TV debates, which entail a high level of complexity as well as additional subjectivity in the human perception-based annotation procedure. A simple convolutional neural model is proposed, and its behaviour is analysed to explain its decision-making. The proposed model slightly outperforms commonly used CNN architectures such as VGG16, while being much lighter. Internal layer-by-layer transformations of the input spectrogram are visualised and analysed. Finally, a class model visualisation is proposed as a simple interpretation approach, and its usefulness is assessed.
Funder
Spanish Minister of Science
European Union’s
University of the Basque Country UPV/EHU
Subject
Fluid Flow and Transfer Processes, Computer Science Applications, Process Chemistry and Technology, General Engineering, Instrumentation, General Materials Science