Affiliation:
1. Music College, Hubei Normal University, Huangshi, Hubei, China
Abstract
In the contemporary landscape of diversified talent cultivation, enhancing education through intelligent means and accelerating talent development are paramount pursuits. In instrumental music education, it is not enough to listen to student performances; their movements must also be assessed to provide additional insight for their subsequent growth. This article introduces a multimodal information fusion evaluation approach that combines sound information and movement data to address the challenge of evaluating students' learning status in college music instruction. The proposed framework leverages Internet of Things (IoT) technology, using strategically positioned microphones and cameras on a local area network for data acquisition. Sound features are extracted with Mel-frequency cepstral coefficients (MFCC), while the OpenPose framework and convolutional neural networks (CNNs) extract movement features from students' performances. The two feature sets are then fused at the feature level by a CNN, and students' learning performance is evaluated by a fully connected network (FCN) with an activation function. Compared with in-class evaluations by the teacher, the approach achieves 95.7% accuracy across the three evaluation categories of Excellent, Good, and Failed. These results offer new insights for music teaching and interactive class evaluation while broadening the applications of multimodal information fusion methods.
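The pipeline described above can be illustrated with a minimal sketch, shown below. It is not the paper's implementation: the feature dimensions (13 MFCC coefficients averaged over time, a 50-dimensional flattened pose vector standing in for OpenPose keypoints), the hidden size of 128, and the class names FusionClassifier and audio_features are all assumptions introduced here for illustration; the actual model presumably fuses CNN feature maps rather than pooled vectors.

```python
# Minimal sketch of feature-level fusion of audio (MFCC) and movement features,
# assuming pooled feature vectors and a small fully connected head.
# Dimensions and layer sizes are illustrative, not taken from the paper.
import librosa
import torch
import torch.nn as nn


def audio_features(wav_path: str, n_mfcc: int = 13) -> torch.Tensor:
    """Extract MFCCs from a recording and average them over time
    into a fixed-length feature vector."""
    y, sr = librosa.load(wav_path, sr=None)
    mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=n_mfcc)  # shape: (n_mfcc, frames)
    return torch.tensor(mfcc.mean(axis=1), dtype=torch.float32)


class FusionClassifier(nn.Module):
    """Concatenate audio and pose feature vectors, then classify the
    performance into Excellent / Good / Failed with a fully connected head."""

    def __init__(self, audio_dim: int = 13, pose_dim: int = 50, n_classes: int = 3):
        super().__init__()
        self.fc = nn.Sequential(
            nn.Linear(audio_dim + pose_dim, 128),
            nn.ReLU(),
            nn.Linear(128, n_classes),
        )

    def forward(self, audio_vec: torch.Tensor, pose_vec: torch.Tensor) -> torch.Tensor:
        fused = torch.cat([audio_vec, pose_vec], dim=-1)  # feature-level fusion
        return torch.softmax(self.fc(fused), dim=-1)      # class probabilities


if __name__ == "__main__":
    model = FusionClassifier()
    audio_vec = torch.randn(13)  # stand-in for MFCC features of one performance
    pose_vec = torch.randn(50)   # stand-in for flattened OpenPose keypoint features
    print(model(audio_vec, pose_vec))
```

In a real setup, audio_vec would come from audio_features() applied to the microphone recording, and pose_vec from keypoints estimated by OpenPose on the camera stream; only the fusion and classification step is sketched here.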