Deep Clustering Efficient Learning Network for Motion Recognition Based on Self-Attention Mechanism
Published: 2023-02-26
Volume: 13
Issue: 5
Page: 2996
ISSN: 2076-3417
Container-title: Applied Sciences
Language: en
Author:
Ru Tielin 1, Zhu Ziheng 2
Affiliation:
1. Sports Department, Xi’an University of Science and Technology, Xi’an 710054, China
2. College of Computer Science and Technology, Xidian University, Xi’an 710071, China
Abstract
Multi-person behavior event recognition has become an increasingly challenging research field in human–computer interaction. With the rapid development of deep learning and computer vision, it plays an important role in the inference and analysis of real sports events. Given a video of a sports event, however, analyzing and judging the behavior trends of the athletes is often constrained by large-scale datasets and hardware limitations: it takes a long time, and the accuracy of the results is low. We therefore propose a deep clustering learning network for motion recognition based on the self-attention mechanism, which addresses both the accuracy and the efficiency of sports event analysis and judgment. By using a long short-term memory (LSTM) network, the method mitigates the vanishing and exploding gradient problems of the recurrent neural network (RNN) and captures the internal correlations among multiple people on the sports field for recognition. It combines the motion coding information in key frames with deep embedded clustering (DEC) to better analyze and judge the complex types of behavior change of athletes. In addition, the self-attention mechanism allows us not only to analyze the whole course of a sports video at the macroscopic level but also to focus on specific attributes of the movement and extract the key posture features of the athletes. This further enhances the features, reduces the number of parameters in the self-attention computation, and lowers the computational complexity while retaining the ability to capture details, so that the accuracy and efficiency of reasoning and judgment are improved. In experiments on large video datasets of mainstream sports, we achieved high accuracy and improved the efficiency of inference and prediction, which shows that the method is effective and feasible for the analysis and reasoning of sports videos.
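As a rough illustration of the deep embedded clustering (DEC) step named in the abstract, the following Python sketch implements the soft assignment and the sharpened target distribution of the standard DEC formulation (Student's t kernel with a KL divergence loss). That the paper follows this standard form is an assumption; the function names, the toy embeddings, and the parameter alpha are hypothetical and chosen only for illustration.

```python
import math

# Illustrative sketch of standard DEC quantities; NOT the paper's implementation.
# All function names and the toy data below are hypothetical.


def soft_assign(embeddings, centroids, alpha=1.0):
    """Student's t soft assignment q[i][j] of embedding i to centroid j."""
    q = []
    for z in embeddings:
        row = []
        for mu in centroids:
            dist_sq = sum((a - b) ** 2 for a, b in zip(z, mu))
            row.append((1.0 + dist_sq / alpha) ** (-(alpha + 1.0) / 2.0))
        total = sum(row)
        q.append([v / total for v in row])  # normalize over clusters
    return q


def target_distribution(q):
    """Sharpened targets p[i][j], proportional to q[i][j]**2 / f[j],
    where f[j] = sum_i q[i][j] is the soft cluster frequency."""
    n_clusters = len(q[0])
    freq = [sum(row[j] for row in q) for j in range(n_clusters)]
    p = []
    for row in q:
        weights = [row[j] ** 2 / freq[j] for j in range(n_clusters)]
        total = sum(weights)
        p.append([w / total for w in weights])
    return p


def kl_divergence(p, q):
    """Clustering loss KL(P || Q), summed over samples and clusters."""
    return sum(
        pij * math.log(pij / qij)
        for p_row, q_row in zip(p, q)
        for pij, qij in zip(p_row, q_row)
        if pij > 0.0
    )


# Toy example: three 2-D "motion embeddings" and two cluster centroids.
embeddings = [[0.0, 0.0], [0.1, 0.0], [5.0, 5.0]]
centroids = [[0.0, 0.0], [5.0, 5.0]]

q = soft_assign(embeddings, centroids)
p = target_distribution(q)
loss = kl_divergence(p, q)
```

In DEC training, the encoder and the centroids would be updated to reduce this loss, which pulls each embedding toward the cluster it is already most confidently assigned to. How the paper couples this step with the LSTM and self-attention features is not specified in the abstract.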
Funder
Shaanxi Provincial Soft Science Research Plan: "Under the Healthy China 2030 Strategy" Shaanxi Provincial Mass Sports and Health Service Industry Integration and Innovation Research
Subject
Fluid Flow and Transfer Processes, Computer Science Applications, Process Chemistry and Technology, General Engineering, Instrumentation, General Materials Science
Cited by
4 articles.