Attention-Based Temporal Encoding Network with Background-Independent Motion Mask for Action Recognition

Author:

Weng Zhengkui1 (ORCID), Jin Zhipeng1, Chen Shuangxi1, Shen Quanquan1, Ren Xiangyang2,3, Li Wuzhao4

Affiliation:

1. Jiaxing Vocational and Technical College, Jiaxing, Zhejiang, China

2. Medical 3D Printing Center, The First Affiliated Hospital of Zhengzhou University, Zhengzhou, Henan, China

3. School of Electrical Engineering, Zhengzhou University, Zhengzhou, Henan, China

4. Wenzhou Polytechnic, Wenzhou, Zhejiang, China

Abstract

Convolutional neural networks (CNNs) have advanced rapidly in recent years. However, the high dimensionality of video data, rich human motion dynamics, and diverse background interference make it difficult for conventional CNNs to capture complex motion in videos. We propose a novel framework, the attention-based temporal encoding network (ATEN) with a background-independent motion mask (BIMM), for video action recognition. First, we introduce a motion segmentation approach based on the boundary prior, computed as the minimal geodesic distance on an undirected weighted graph. We then propose a dynamic contrast segmentation strategy for segmenting moving objects in complex environments. On this basis, we build the BIMM, which enhances the moving object by suppressing the irrelevant background in each frame. Furthermore, we design a long-range attention mechanism inside ATEN that effectively models the long-term dependencies of complex non-periodic actions by automatically focusing on semantically important frames rather than treating all sampled frames equally; the attention mechanism thus suppresses temporal redundancy and highlights discriminative frames. Finally, the framework is evaluated on the UCF101 and HMDB51 datasets, where ATEN with BIMM achieves 94.5% and 70.6% accuracy, respectively, outperforming a number of existing methods on both datasets.
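To make the boundary-prior step concrete, the sketch below computes geodesic distances from frame-border superpixels over an undirected weighted graph and turns them into a soft foreground mask. It is a minimal illustration only: the superpixel descriptors, edge weighting, and the final squashing function are assumptions, not the authors' exact formulation of the BIMM.

```python
# Hedged sketch: boundary-prior geodesic distance on an undirected weighted
# superpixel graph, in the spirit of the motion-segmentation step above.
import numpy as np
from scipy.sparse import csr_matrix
from scipy.sparse.csgraph import dijkstra

def boundary_geodesic_mask(features, edges, boundary_nodes, sigma=0.1):
    """features: (N, D) per-superpixel appearance/motion descriptors (assumed).
    edges: list of (i, j) pairs of spatially adjacent superpixels.
    boundary_nodes: indices of superpixels touching the frame border.
    Returns a soft foreground score per superpixel."""
    n = features.shape[0]
    rows, cols, weights = [], [], []
    for i, j in edges:
        # Edge weight: feature dissimilarity between adjacent superpixels
        # (Euclidean here; the paper's exact metric may differ).
        w = float(np.linalg.norm(features[i] - features[j]))
        rows += [i, j]
        cols += [j, i]
        weights += [w, w]
    graph = csr_matrix((weights, (rows, cols)), shape=(n, n))
    # Shortest-path (geodesic) distances from all boundary nodes at once.
    dist = dijkstra(graph, directed=False, indices=boundary_nodes)
    d_min = dist.min(axis=0)  # minimal geodesic distance to the boundary
    # Superpixels far (in feature-weighted path length) from the boundary are
    # more likely foreground; squash distances into a soft mask in [0, 1).
    return 1.0 - np.exp(-(d_min ** 2) / (2.0 * sigma ** 2))
```

The resulting per-superpixel scores could then be thresholded or combined with the dynamic contrast segmentation step to suppress background regions before feeding frames to the network; sigma is simply a scale parameter for the illustrative mask.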
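The long-range attention idea, weighting sampled frames by learned relevance instead of averaging them uniformly, can likewise be illustrated with a short module. This is a generic temporal attention pooling layer written as an assumption about the mechanism's form; the layer sizes, scoring MLP, and feature dimensions are illustrative and not ATEN's exact design.

```python
# Hedged sketch: temporal attention pooling over sampled frame features,
# so semantically important frames dominate the video-level representation.
import torch
import torch.nn as nn

class TemporalAttentionPooling(nn.Module):
    def __init__(self, feat_dim=2048, hidden_dim=256):
        super().__init__()
        # Small MLP mapping each frame feature to a scalar relevance score.
        self.score = nn.Sequential(
            nn.Linear(feat_dim, hidden_dim),
            nn.Tanh(),
            nn.Linear(hidden_dim, 1),
        )

    def forward(self, frame_feats):
        # frame_feats: (batch, num_frames, feat_dim) CNN features of sampled frames.
        scores = self.score(frame_feats)                 # (B, T, 1)
        weights = torch.softmax(scores, dim=1)           # attention over time
        # Weighted sum downplays redundant frames and highlights discriminative ones.
        video_feat = (weights * frame_feats).sum(dim=1)  # (B, feat_dim)
        return video_feat, weights.squeeze(-1)

# Usage: pooled, attn = TemporalAttentionPooling()(torch.randn(2, 25, 2048))
```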

Funder

Natural Science Foundation of Zhejiang Province

Publisher

Hindawi Limited

Subject

General Mathematics, General Medicine, General Neuroscience, General Computer Science


Cited by 1 article.