Affiliation:
1. School of Mechanical and Electronic Engineering, Wuhan University of Technology, Wuhan, China
2. Wuhan Baohua Display Technology Co., Ltd, Wuhan, China
Abstract
In order to reduce the incidence of traffic accidents caused by the emotional state of drivers, this study proposes an emotion recognition algorithm based on vehicle noise environment. This algorithm can effectively identify the emotional state of drivers and provide support for further improving their emotions. To address challenges in existing research on speech emotion recognition, such as excessive model parameters, poor generalization, and suboptimal performance in noisy environments, this paper proposes a lightweight network model suitable for small datasets. The model utilizes Power Normalized Cepstral Coefficients (PNCC) as input features, and employs parallel feature extraction layers at different scales. These features are then fed into a feature learning module for in-depth extraction, with the final determination of the driver’s emotional state made by the output layer. Experimental results show that the model achieves an accuracy of 96.08% on the EMO-DB speech dataset. Even in simulated in-vehicle noise environments, the model exhibits high accuracy and robustness. Moreover, compared to other lightweight models, it has fewer training parameters and faster processing speed, making it suitable for deployment on edge devices in mobile applications.