Sports Video Classification Method Based on Improved Deep Learning
Published: 2024-01-22
Volume: 14, Issue: 2, Page: 948
ISSN: 2076-3417
Container-title: Applied Sciences
Short-container-title: Applied Sciences
Language: en
Author:
Gao Tianhao 1, Zhang Meng 1, Zhu Yifan 2, Zhang Youjian 1, Pang Xiangsheng 1, Ying Jing 2, Liu Wenming 1
Affiliation:
1. Department of Sport Science, College of Education, Zhejiang University, Hangzhou 310027, China
2. College of Computer Science, Zhejiang University, Hangzhou 310027, China
Abstract
Classifying sports videos is difficult because of their highly dynamic content. Traditional methods, such as optical flow and the Histogram of Oriented Gradients (HOG), require hand-crafted expertise and generalize poorly. Deep learning, particularly Convolutional Neural Networks (CNNs), recognizes features in sports videos more effectively, but standard CNNs still struggle with fast-paced or low-resolution footage. Our proposed neural network model addresses these challenges. It first selects key frames from the sports footage and applies a fuzzy noise-reduction algorithm to enhance video quality. A bifurcated (two-branch) neural network then extracts detailed features, followed by a densely connected network with a tailored activation function that categorizes the videos. We evaluated the model on a High-Definition Sports Video Dataset covering more than 20 sports and on a separate low-resolution dataset. Our model outperformed established classifiers, including DenseNet, VggNet, Inception v3, and ResNet-50. It achieved high precision (0.9718), accuracy (0.9804), F-score (0.9761), and recall (0.9723) on the high-resolution dataset, and notably better precision (0.8725) on the low-resolution dataset. For comparison, the best corresponding values among the four baseline models were precision (0.9690), accuracy (0.9781), F-score (0.9670), and recall (0.9681) on the high-resolution dataset, and precision (0.8627) on the low-resolution dataset. These results demonstrate the model's superior performance in sports video classification under varied conditions, including rapid motion and low resolution, and mark a significant step forward in sports data analytics and content categorization.
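The abstract compares models by precision, accuracy, F-score, and recall. As a minimal, self-contained sketch of how these four metrics are computed from a per-class confusion breakdown (this code and the toy counts below are illustrative, not taken from the paper):

```python
# Standard definitions of the four metrics reported in the abstract,
# computed from true/false positive/negative counts for one class.

def precision(tp, fp):
    return tp / (tp + fp)

def recall(tp, fn):
    return tp / (tp + fn)

def f_score(p, r):
    # Harmonic mean of precision and recall (F1);
    # equivalently 2*TP / (2*TP + FP + FN).
    return 2 * p * r / (p + r)

def accuracy(tp, tn, fp, fn):
    return (tp + tn) / (tp + tn + fp + fn)

# Hypothetical counts for one sport class in a multi-class video dataset.
tp, tn, fp, fn = 95, 880, 10, 15

p = precision(tp, fp)          # 95 / 105
r = recall(tp, fn)             # 95 / 110
f = f_score(p, r)              # 190 / 215
a = accuracy(tp, tn, fp, fn)   # 975 / 1000
print(round(p, 4), round(r, 4), round(f, 4), round(a, 4))
```

In a multi-class setting such as a 20-sport dataset, these per-class values are typically averaged (macro or weighted) to produce the single figures quoted in the abstract.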