Abstract
In video motion recognition, as network depth increases, the growth in the number of parameters becomes asymmetric with the accuracy achieved. We propose a micro-attention branch structure and a method for integrating attention branches into multi-branch 3D convolutional networks to alleviate this asymmetry. The attention branches can be added to a 3D convolutional network as plug-ins, without changing the overall structure of the original network. With this structure, the resulting network fuses the attention features extracted by the attention branches in real time during feature extraction. Adding attention branches lets the model focus on the action subject more accurately, which improves its accuracy. The 3D micro-attention branches also adapt well to existing networks whose building modules contain multiple sub-branches. On the Kinetics dataset, we use the proposed micro-attention branch structure to construct a deep network and compare it with the original network. The experimental results show that the network with micro-attention branches improves recognition accuracy by 3.6% over the original network, while the number of trainable parameters increases by only 0.6%.
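To give a feel for why a small attention branch can add so few parameters, the sketch below counts the trainable parameters of a 3D convolution and of a bottleneck-style branch. The branch layout (two 1x1x1 convolutions with a channel reduction factor), the function names, and the channel widths are all assumptions made for illustration; the abstract does not specify the internal design of the micro-attention branch, and the numbers here are not meant to reproduce the reported 0.6%.

```python
def conv3d_params(c_in, c_out, kernel=(3, 3, 3), bias=True):
    """Number of trainable parameters in one 3D convolution layer."""
    kt, kh, kw = kernel
    return c_in * c_out * kt * kh * kw + (c_out if bias else 0)


def attention_branch_params(channels, reduction=16):
    """Parameters of a hypothetical bottleneck branch (assumed design):
    a 1x1x1 conv that shrinks the channels, then a 1x1x1 conv that restores them."""
    hidden = max(channels // reduction, 1)
    squeeze = conv3d_params(channels, hidden, kernel=(1, 1, 1))
    expand = conv3d_params(hidden, channels, kernel=(1, 1, 1))
    return squeeze + expand


def relative_overhead(channels, reduction=16):
    """Extra parameters added by the branch, as a fraction of one 3x3x3 conv."""
    base = conv3d_params(channels, channels)
    return attention_branch_params(channels, reduction) / base


if __name__ == "__main__":
    # Illustrative channel widths only; not taken from the paper.
    for channels in (64, 256, 512):
        print(channels, f"{relative_overhead(channels):.2%}")
```

Under these assumptions the branch costs well under 1% of a single 3x3x3 convolution of the same width, because its 1x1x1 kernels avoid the 27x factor of the spatial-temporal kernel and the bottleneck shrinks the channel count further.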
Funder
National Natural Science Foundation of China
Subject
Physics and Astronomy (miscellaneous),General Mathematics,Chemistry (miscellaneous),Computer Science (miscellaneous)
Cited by
1 article.