Abstract
Studying the real-time facial expression states of teachers in class is important for building an objective, AI-based classroom teaching evaluation system. However, face-to-face communication in the classroom is a real-time process that operates on a millisecond time scale. Therefore, to predict teachers' facial expressions quickly and accurately in real time, this paper proposes an improved YOLOv5 network that introduces attention mechanisms into the Backbone of YOLOv5. In the experiments, we investigated the effects of different attention mechanisms on YOLOv5 by adding each of them after every CBS module in the CSP1_X structure of the Backbone. In addition, the attention mechanisms were incorporated at different locations, namely the Focus, CBS, and SPP modules of YOLOv5, to study their effects on different modules. The results show that the network in which coordinate attention is incorporated after each CBS module in the CSP1_X structure achieves a detection time of 25 ms and an accuracy of 77.1%, an increase of 3.5% over YOLOv5. It outperforms other networks, including Faster R-CNN, R-FCN, ResNeXt-101, DETR, Swin Transformer, YOLOv3, and YOLOX. Finally, a real-time teacher facial expression recognition system was designed to detect and analyze the distribution of teachers' facial expressions over time from camera input and teaching videos.
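The paper itself does not include code; the sketch below is only an illustration, under common assumptions, of the modification the abstract describes: a coordinate attention layer (Hou et al., 2021) placed directly after a Conv-BatchNorm-SiLU ("CBS") block, the unit repeated inside each CSP1_X structure of the YOLOv5 Backbone. All module and parameter names here (CBS, CoordAtt, CBS_CA, reduction) are illustrative, not taken from the authors' implementation.

```python
import torch
import torch.nn as nn


class CBS(nn.Module):
    """Conv-BatchNorm-SiLU block, the basic unit of the YOLOv5 backbone."""
    def __init__(self, c_in, c_out, k=1, s=1):
        super().__init__()
        self.conv = nn.Conv2d(c_in, c_out, k, s, k // 2, bias=False)
        self.bn = nn.BatchNorm2d(c_out)
        self.act = nn.SiLU()

    def forward(self, x):
        return self.act(self.bn(self.conv(x)))


class CoordAtt(nn.Module):
    """Coordinate attention: channel attention factorized along H and W axes."""
    def __init__(self, channels, reduction=32):
        super().__init__()
        mid = max(8, channels // reduction)
        self.pool_h = nn.AdaptiveAvgPool2d((None, 1))   # (N, C, H, 1)
        self.pool_w = nn.AdaptiveAvgPool2d((1, None))   # (N, C, 1, W)
        self.conv1 = nn.Conv2d(channels, mid, 1)
        self.bn1 = nn.BatchNorm2d(mid)
        self.act = nn.Hardswish()
        self.conv_h = nn.Conv2d(mid, channels, 1)
        self.conv_w = nn.Conv2d(mid, channels, 1)

    def forward(self, x):
        n, c, h, w = x.shape
        x_h = self.pool_h(x)                            # pool over width
        x_w = self.pool_w(x).permute(0, 1, 3, 2)        # pool over height, (N, C, W, 1)
        y = self.act(self.bn1(self.conv1(torch.cat([x_h, x_w], dim=2))))
        y_h, y_w = torch.split(y, [h, w], dim=2)
        a_h = torch.sigmoid(self.conv_h(y_h))                      # (N, C, H, 1)
        a_w = torch.sigmoid(self.conv_w(y_w.permute(0, 1, 3, 2)))  # (N, C, 1, W)
        return x * a_h * a_w


class CBS_CA(nn.Module):
    """A CBS block followed by coordinate attention, the pattern the paper
    inserts after each CBS module inside the CSP1_X structures."""
    def __init__(self, c_in, c_out, k=1, s=1):
        super().__init__()
        self.cbs = CBS(c_in, c_out, k, s)
        self.ca = CoordAtt(c_out)

    def forward(self, x):
        return self.ca(self.cbs(x))


if __name__ == "__main__":
    x = torch.randn(1, 64, 80, 80)
    print(CBS_CA(64, 128, k=3)(x).shape)  # torch.Size([1, 128, 80, 80])
```

Because the attention weights are computed separately along the height and width axes, the added cost is small, which is consistent with the reported real-time detection time of 25 ms.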
Funder
Tianjin Science and Technology Program
National Education Science Planning
Publisher
Springer Science and Business Media LLC
Cited by
4 articles.