Enhancing Emotion Recognition through Federated Learning: A Multimodal Approach with Convolutional Neural Networks

Published:2024-02-06 Issue:4 Volume:14 Page:1325
ISSN:2076-3417
Container-title:Applied Sciences
language:en
Short-container-title:Applied Sciences

Author:

Simić Nikola¹, Suzić Siniša¹, Milošević Nemanja², Stanojev Vuk¹, Nosek Tijana¹, Popović Branislav¹, Bajović Dragana¹

Affiliation:

1. Faculty of Technical Sciences, University of Novi Sad, 21000 Novi Sad, Serbia

2. Faculty of Sciences, University of Novi Sad, 21000 Novi Sad, Serbia

Abstract

Human–machine interaction covers a range of applications in which machines should understand humans’ commands and predict their behavior. Humans’ moods commonly change over time, which affects the way they interact, particularly through changes in speech style and facial expressions. Because interaction requires quick decisions, low latency is critical for real-time processing. Edge devices, placed near the data source, minimize processing time and enable real-time decision-making. Edge computing also allows data to be processed locally, reducing the need to send sensitive information further through the network. Despite the wide adoption of audio-only, video-only, and multimodal emotion recognition systems, there is a research gap in analyzing lightweight models and in solving privacy challenges to improve model performance. This motivated us to develop a privacy-preserving, lightweight audiovisual emotion recognition model, deployable on constrained edge devices and based on CNNs, which are frequently used for processing audio and video modalities. The model is further paired with a federated learning protocol to preserve the privacy of local clients on edge devices and improve detection accuracy. The results show that adopting federated learning improved classification accuracy by ~2% and that the proposed federated learning-based model provides competitive performance compared to other baseline audiovisual emotion recognition models.

Funder

European Union’s Horizon 2020 research

Publisher

MDPI AG

Link

https://www.mdpi.com/2076-3417/14/4/1325/pdf

Reference: 39 articles.

1. Automatic speech recognition: A survey;Malik;Multimed. Tools Appl.,2021

2. Integrating face and voice in person perception;Campanella;Trends Cogn. Sci.,2007

3. Survey on audiovisual emotion recognition: Databases, features, and data fusion strategies;Wu;APSIPA Trans. Signal Inf. Process.,2014

4. Audiovisual emotion recognition in wild;Avots;Mach. Vis. Appl.,2019

5. Leveraging recent advances in deep learning for audio-visual emotion recognition;Schoneveld;Pattern Recognit. Lett.,2021

Cited by 5 articles.

1. Federated regressive learning: Adaptive weight updates through statistical information of clients;Applied Soft Computing;2024-11

2. Multimodal Emotion Recognition Using Visual, Vocal and Physiological Signals: A Review;Applied Sciences;2024-09-09

3. A lightweight and privacy preserved federated learning ecosystem for analyzing verbal communication emotions in identical and non-identical databases;Measurement: Sensors;2024-08

4. Bio-Inspired Hyperparameter Tuning of Federated Learning for Student Activity Recognition in Online Exam Environment;AI;2024-07-01

5. A Multifaceted Survey on Federated Learning: Fundamentals, Paradigm Shifts, Practical Issues, Recent Developments, Partnerships, Trade-Offs, Trustworthiness, and Ways Forward;IEEE Access;2024