Abstract
AbstractThe intersection of psychology and English teaching is profound, as the application of psychological principles not only guides specific English instruction but also elevates the overall quality of teaching. This paper takes a multimodal approach, incorporating image, acoustics, and text information, to construct a joint analysis model for English teaching interaction and psychological characteristics. The novel addition of an attention mechanism in the multimodal fusion process enables the development of an English teaching psychological characteristics recognition model. The initial step involves balancing the proportions of each emotion, followed by achieving multimodal alignment. In the cross-modal stage, the interaction of image, acoustics, and text is facilitated through a cross-modal attention mechanism. The utilization of a multi-attention mechanism not only enhances the network’s representation capabilities but also streamlines the complexity of the model. Empirical results demonstrate the model’s proficiency in accurately identifying five psychological characteristics. The proposed method achieves a classification accuracy of 90.40% for psychological features, with a commendable accuracy of 78.47% in multimodal classification. Furthermore, the incorporation of the attention mechanism in feature fusion contributes to an improved fusion effect.
Publisher
Springer Science and Business Media LLC