Authors:
YANG Yifan, ZHANG Tao, LI Weiyu
Abstract
Computer vision, a scientific discipline that enables machines to perceive visual information, aims to supplant human eyes in tasks such as object recognition, localization, and tracking. In traditional educational settings, instructors or evaluators assess teaching performance based on subjective judgment. With continuing advances in computer vision technology, it is becoming increasingly important for computers to act as judges that extract key information and make unbiased evaluations. Against this backdrop, this paper proposes a deep learning-based approach for evaluating lecture posture. First, feature information is extracted along several dimensions, including head position, hand gestures, and body posture, using a human pose estimation algorithm. Second, a machine learning-based regression model, fitted to expert-assigned human scores, predicts machine scores from the extracted features. The correlation between machine scores and human scores is investigated through experiment and analysis, revealing a robust overall correlation (0.6420) between predicted machine scores and human scores. Under ideal scoring conditions (100 points), approximately 51.72% of predicted machine scores deviated from the human scores by within 10 points and around 81.87% by within 20 points, while only 0.12% deviated by more than 50 points. Finally, to further optimize performance, additional features related to bodily movements are extracted by introducing facial expression recognition and gesture recognition algorithms. Fusing the multiple models improved the overall average correlation by 0.0226.
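The evaluation metrics named in the abstract, the correlation between machine and human scores and the share of predictions within a given point deviation, can be illustrated with a minimal sketch. It assumes that the correlation is a Pearson coefficient and that "within N points" means an absolute difference of at most N; the abstract confirms neither. The function names and the sample scores are hypothetical and are not taken from the paper.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def share_within(machine, human, tolerance):
    """Fraction of samples whose absolute deviation is at most `tolerance` points."""
    hits = sum(1 for m, h in zip(machine, human) if abs(m - h) <= tolerance)
    return hits / len(machine)


# Made-up scores on a 100-point scale, for illustration only (not the paper's data).
human_scores = [62, 75, 80, 55, 90, 70]
machine_scores = [70, 72, 65, 58, 84, 95]

r = pearson(machine_scores, human_scores)
within_10 = share_within(machine_scores, human_scores, 10)
within_20 = share_within(machine_scores, human_scores, 20)

print(f"correlation: {r:.4f}")
print(f"within 10 points: {within_10:.2%}")
print(f"within 20 points: {within_20:.2%}")
```

Reporting both kinds of metric is useful because they answer different questions: the correlation shows whether machine scores rank lectures similarly to the experts, while the deviation shares show how far individual predictions land from the expert score.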