Multi-Scale Attention 3D Convolutional Network for Multimodal Gesture Recognition

Published:2022-03-21 Issue:6 Volume:22 Page:2405
ISSN:1424-8220
Container-title:Sensors
language:en
Short-container-title:Sensors

Author:

Chen Huizhou, Li Yunan, Fang Huijuan, Xin Wentian, Lu Zixiang, Miao Qiguang

Abstract

Gesture recognition is an important direction in computer vision research, and information from the hands is crucial to the task. However, current methods typically direct attention to hand regions using estimated keypoints, which significantly increases both computation time and complexity and can lose the position of the hand when keypoints are estimated incorrectly. Moreover, for dynamic gesture recognition, attention in the spatial dimension alone is not enough. This paper proposes a multi-scale attention 3D convolutional network for gesture recognition that fuses multimodal data. The proposed network applies attention both locally and globally. Local attention uses the hand information extracted by a hand detector to focus on the hand region and reduce interference from gesture-irrelevant factors. Global attention is achieved in both the human-posture context and the channel context through a dual spatiotemporal attention module. Furthermore, to make full use of the differences between data modalities, we design a multimodal fusion scheme that fuses the features of RGB and depth data. The proposed method is evaluated on the ChaLearn LAP Isolated Gesture Dataset and the Briareo Dataset. Experiments on these two datasets demonstrate the effectiveness of our network and show that it outperforms many state-of-the-art methods.
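
The abstract does not give the equations of the dual spatiotemporal attention module. As a rough illustration of what attention in the channel context can look like, the sketch below reweights each channel of a flattened spatiotemporal feature map by its similarity to every other channel and adds the result back through a residual connection. This is a generic, simplified channel-attention formulation and is not taken from the paper; the function names and the `gamma` residual scale are hypothetical, and plain Python lists are used only to keep the example self-contained (a real model would use a tensor library and learn `gamma`).

```python
import math


def softmax(row):
    """Numerically stable softmax over a list of floats."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def channel_attention(features, gamma=1.0):
    """Reweight channels by their pairwise similarity (illustrative only).

    features: C lists, each one channel flattened over
              (time, height, width) into N values.
    gamma:    hypothetical residual scale; learned in a real network.
    """
    c = len(features)
    n = len(features[0])
    # C x C affinity: dot product between every pair of channels.
    affinity = [
        [sum(a * b for a, b in zip(features[i], features[j])) for j in range(c)]
        for i in range(c)
    ]
    # Normalise each row so the weights over channels sum to 1.
    weights = [softmax(row) for row in affinity]
    out = []
    for i in range(c):
        # Attention-weighted mix of all channels for output channel i.
        mixed = [
            sum(weights[i][j] * features[j][k] for j in range(c))
            for k in range(n)
        ]
        # Residual connection keeps the original signal.
        out.append([gamma * m + x for m, x in zip(mixed, features[i])])
    return out
```

With `gamma=0.0` the function returns its input unchanged, which is a convenient sanity check; larger values let correlated channels reinforce one another.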

Funder

National Natural Science Foundation of China

Fundamental Research Funds for the Central Universities

China Postdoctoral Science Foundation

Publisher

MDPI AG

Subject

Electrical and Electronic Engineering; Biochemistry; Instrumentation; Atomic and Molecular Physics, and Optics; Analytical Chemistry

Link

https://www.mdpi.com/1424-8220/22/6/2405/pdf

References: 33 articles.

1. Regional Attention with Architecture-Rebuilt 3D Network for RGB-D Gesture Recognition;Zhou;arXiv,2021