Abstract
To address the difficulty and poor real-time performance of dynamic expression recognition in classroom scenes, a lightweight dynamic expression recognition algorithm based on an improved MobileNetV3 is proposed. First, a GRU (Gated Recurrent Unit) feature extraction module is embedded in the MobileNetV3 network. It processes the spatial feature vector of each expression image and extracts the temporal features across the image sequence, so that the way an expression changes over time is fully exploited. Second, a new hybrid loss, LMCF (Large Margin Cosine Focal Loss), is proposed to build a hypersphere of facial expression features. It increases the inter-class distance of expressions by enlarging the cosine distance, and it alleviates the blurring of inter-class feature boundaries caused by imbalanced expression data. Finally, a sparsely connected pointwise group convolution is adopted to optimize the depthwise separable convolution in MobileNetV3, which reduces model complexity and increases inference speed. Experimental results show that the proposed algorithm outperforms the compared algorithms in both accuracy and speed on the classroom-scene test set: the mean average precision (mAP) is improved by up to 2.88%, and the recognition speed is improved by up to 12 FPS.
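
The abstract does not give the LMCF formula, so the following is only an illustrative sketch of how a large-margin cosine term and a focal term could be combined for a single sample. It assumes the common large-margin cosine form (the margin `m` is subtracted from the target-class cosine, and the logits are scaled by `s`) multiplied by the usual focal factor `(1 - p)^gamma`. The function name `lmcf_loss` and the default values of `s`, `m`, and `gamma` are hypothetical and are not taken from the paper.

```python
import math


def lmcf_loss(feature, class_weights, target, s=30.0, m=0.35, gamma=2.0):
    """Illustrative large-margin-cosine + focal loss for one sample.

    Assumed form (not confirmed by the abstract):
        logit_j = s * (cos(theta_j) - m)   if j == target
        logit_j = s * cos(theta_j)         otherwise
        loss    = -(1 - p_target) ** gamma * log(p_target)
    """

    def unit(v):
        # L2-normalize so every vector lies on the unit hypersphere.
        norm = math.sqrt(sum(x * x for x in v))
        return [x / norm for x in v]

    f = unit(feature)
    # Cosine similarity between the feature and each class weight vector.
    cosines = [sum(a * b for a, b in zip(f, unit(w))) for w in class_weights]
    # Apply the margin to the target class only, then scale all logits.
    logits = [s * (c - m) if j == target else s * c
              for j, c in enumerate(cosines)]
    # Numerically stable softmax probability of the target class.
    peak = max(logits)
    exps = [math.exp(z - peak) for z in logits]
    p_target = exps[target] / sum(exps)
    # The focal factor down-weights samples that are already well classified.
    return -((1.0 - p_target) ** gamma) * math.log(p_target)


# Two hypothetical expression classes in a 2-D feature space.
weights = [[1.0, 0.0], [0.0, 1.0]]
easy = lmcf_loss([1.0, 0.0], weights, target=0)  # feature aligned with its class
hard = lmcf_loss([1.0, 0.0], weights, target=1)  # feature aligned with the wrong class
```

Under these assumptions, setting `gamma=0` and `m=0` reduces the sketch to ordinary softmax cross-entropy on scaled cosine logits. Raising `m` forces a larger angular gap between classes, and raising `gamma` shifts the weight toward hard samples, which in imbalanced data often belong to under-represented classes.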