Authors:
Yang Qi, Lu Tongwei, Zhou Huabing
Abstract
Temporal modeling is key for action recognition in videos, but traditional 2D CNNs do not capture temporal relationships well. 3D CNNs can achieve good performance, but they are computationally intensive and difficult to deploy on existing devices. To address these problems, we design a generic and effective module called the spatio-temporal motion network (SMNet). SMNet maintains the complexity of 2D CNNs and reduces the computational cost of the algorithm while achieving performance comparable to 3D CNNs. SMNet contains a spatio-temporal excitation module (SE) and a motion excitation module (ME). The SE module uses group convolution to fuse temporal information, which reduces the number of parameters in the network, and uses spatial attention to extract spatial information. The ME module uses the difference between adjacent frames to extract feature-level motion patterns, which effectively encodes motion features and helps identify actions efficiently. We use ResNet-50 as the backbone network and insert SMNet into its residual blocks to form a simple and effective action recognition network. Experimental results on three datasets, namely Something-Something V1, Something-Something V2, and Kinetics-400, show that it outperforms state-of-the-art action recognition networks.
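The frame-difference idea behind the ME module can be illustrated with a short sketch. The PyTorch code below is a minimal, hypothetical rendering of a motion-excitation block based only on the abstract's description, not the authors' exact implementation; the reduction ratio, the depthwise 3x3 convolution, and the sigmoid gating are assumptions (the SE module is not sketched here).

```python
import torch
import torch.nn as nn

class MotionExcitation(nn.Module):
    """Sketch of a motion-excitation (ME) block: feature-level differences
    between adjacent frames gate the spatial features of each frame.
    Hyperparameters (reduction ratio, depthwise 3x3 conv) are assumptions."""

    def __init__(self, channels: int, n_segments: int, reduction: int = 16):
        super().__init__()
        self.n_segments = n_segments
        mid = max(channels // reduction, 1)
        # channel squeeze before differencing keeps the block lightweight
        self.squeeze = nn.Conv2d(channels, mid, kernel_size=1, bias=False)
        # per-channel spatial transform applied to the "next" frame
        self.transform = nn.Conv2d(mid, mid, kernel_size=3, padding=1,
                                   groups=mid, bias=False)
        self.expand = nn.Conv2d(mid, channels, kernel_size=1, bias=False)
        self.pool = nn.AdaptiveAvgPool2d(1)
        self.sigmoid = nn.Sigmoid()

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (N*T, C, H, W), frames of each clip stacked along the batch dim
        nt, c, h, w = x.shape
        t = self.n_segments
        n = nt // t
        feat = self.squeeze(x).view(n, t, -1, h, w)
        # adjacent-frame difference: transform(frame_{t+1}) - frame_t
        nxt = self.transform(feat[:, 1:].reshape(n * (t - 1), -1, h, w))
        diff = nxt.view(n, t - 1, -1, h, w) - feat[:, :-1]
        # zero-pad the last time step so the temporal length matches the input
        diff = torch.cat([diff, torch.zeros_like(feat[:, :1])], dim=1)
        # pool the motion features into a per-channel attention map
        attn = self.sigmoid(self.expand(self.pool(diff.view(nt, -1, h, w))))
        return x + x * attn  # residual motion gating

# usage: 2 clips of 8 frames, 64-channel feature maps
me = MotionExcitation(channels=64, n_segments=8)
clip = torch.randn(2 * 8, 64, 56, 56)
out = me(clip)  # same shape as the input
```

Keeping frames stacked along the batch dimension lets such a block drop into a 2D ResNet-50 residual block without changing the surrounding layers, which is consistent with the abstract's claim of retaining 2D complexity.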
Funder
National Natural Science Foundation of China
Hubei Technology Innovation Project
Subject
General Physics and Astronomy
Cited by
10 articles.