Spatial-Temporal Multi-Cue Network for Continuous Sign Language Recognition

Published:2020-04-03 Issue:07 Volume:34 Page:13009-13016
ISSN:2374-3468
Container-title:Proceedings of the AAAI Conference on Artificial Intelligence
Short-container-title:AAAI

Author:

Zhou Hao,Zhou Wengang,Zhou Yun,Li Houqiang

Abstract

Despite the recent success of deep learning in continuous sign language recognition (CSLR), deep models typically focus on the most discriminative features, ignoring other potentially non-trivial and informative contents. This characteristic heavily constrains their capability to learn implicit visual grammars behind the collaboration of different visual cues (i.e., hand shape, facial expression and body posture). By injecting multi-cue learning into neural network design, we propose a spatial-temporal multi-cue (STMC) network to solve the vision-based sequence learning problem. Our STMC network consists of a spatial multi-cue (SMC) module and a temporal multi-cue (TMC) module. The SMC module is dedicated to spatial representation and explicitly decomposes visual features of different cues with the aid of a self-contained pose estimation branch. The TMC module models temporal correlations along two parallel paths, i.e., intra-cue and inter-cue, which aim to preserve the uniqueness and explore the collaboration of multiple cues. Finally, we design a joint optimization strategy to achieve the end-to-end sequence learning of the STMC network. To validate its effectiveness, we perform experiments on three large-scale CSLR benchmarks: PHOENIX-2014, CSL and PHOENIX-2014-T. Experimental results demonstrate that the proposed method achieves new state-of-the-art performance on all three benchmarks.
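
The two-path idea behind the TMC module can be illustrated with a toy sketch. This is not the authors' implementation: the function names (`temporal_smooth`, `tmc_block`), the cue names, and the use of a plain moving average in place of a learned temporal layer are all assumptions made for illustration. The sketch only shows the layout the abstract describes, with one path that processes each cue separately over time and a parallel path that processes the per-frame concatenation of all cues.

```python
from typing import Dict, List, Tuple

Sequence = List[List[float]]  # T frames, each a feature vector


def temporal_smooth(seq: Sequence, kernel: int = 3) -> Sequence:
    """Moving average over time (hypothetical stand-in for a learned temporal layer)."""
    half = kernel // 2
    dim = len(seq[0])
    out = []
    for t in range(len(seq)):
        # Window is truncated at the sequence boundaries.
        window = seq[max(0, t - half): t + half + 1]
        out.append([sum(f[d] for f in window) / len(window) for d in range(dim)])
    return out


def tmc_block(cues: Dict[str, Sequence], kernel: int = 3) -> Tuple[Dict[str, Sequence], Sequence]:
    """Toy two-path block: returns (intra-cue outputs, inter-cue output)."""
    # Intra-cue path: each cue is processed on its own, so its features stay separate.
    intra = {name: temporal_smooth(seq, kernel) for name, seq in cues.items()}
    # Inter-cue path: concatenate all cues frame by frame, then apply one shared temporal op.
    names = sorted(cues)
    num_frames = len(cues[names[0]])
    fused = [[x for name in names for x in cues[name][t]] for t in range(num_frames)]
    inter = temporal_smooth(fused, kernel)
    return intra, inter


# Hypothetical one-dimensional features for two cues over three frames.
cues = {"hand": [[1.0], [3.0], [5.0]], "face": [[0.0], [0.0], [6.0]]}
intra, inter = tmc_block(cues)
```

Because the moving average is linear, the inter-cue output here is just the concatenation of the intra-cue outputs. In a trained network the temporal layers would have learned weights, so the inter-cue path could mix information across cues while the intra-cue path keeps each cue distinct.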

Publisher

Association for the Advancement of Artificial Intelligence (AAAI)

Subject

General Medicine

Cited by 69 articles.

1. Improving Continuous Sign Language Recognition with Consistency Constraints and Signer Removal;ACM Transactions on Multimedia Computing, Communications, and Applications;2024-01-15

2. Scalable frame resolution for efficient continuous sign language recognition;Pattern Recognition;2024-01

3. KSRB-Net: a continuous sign language recognition deep learning strategy based on motion perception mechanism;The Visual Computer;2023-12-26

4. A survey on sign language literature;Machine Learning with Applications;2023-12

5. Visual feature segmentation with reinforcement learning for continuous sign language recognition;International Journal of Multimedia Information Retrieval;2023-11-18