Dense Temporal Convolution Network for Sign Language Translation

Published: 2019-08
Container-title: Proceedings of the Twenty-Eighth International Joint Conference on Artificial Intelligence

Author:

Guo Dan¹, Wang Shuo¹, Tian Qi²,³, Wang Meng¹

Affiliation:

1. School of Computer Science and Information Engineering, Hefei University of Technology

2. Huawei Noah’s Ark Lab

3. Department of Computer Science, University of Texas at San Antonio

Abstract

Sign language translation (SLT), which aims to translate a sign language video into natural language, is a weakly supervised task, because there is no exact mapping between the visual actions and the textual words in a sentence label. To align the sign language actions and translate them into the respective words automatically, this paper proposes a dense temporal convolution network, termed DenseTCN, which captures the actions in hierarchical views. Within this network, a temporal convolution (TC) is designed to learn the short-term correlation among adjacent features and is further extended to a dense hierarchical structure. In the k-th TC layer, we integrate the outputs of all preceding layers: (1) the TC in a deeper layer has a larger receptive field, which captures long-term temporal context through the hierarchical content transition; (2) the integration addresses the SLT problem from different views, including embedded short-term and extended long-term sequential learning. Finally, we adopt the CTC loss and a fusion strategy to learn the feature-wise classification and generate the translated sentence. Experimental results on two popular sign language benchmarks, PHOENIX and USTC-ConSents, demonstrate the effectiveness of the proposed method in terms of various measurements.
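The dense connection pattern described in the abstract can be illustrated with a minimal pure-Python sketch. It is not the authors' implementation: the kernel size of 3, the zero padding, the ReLU activation, the channel-wise concatenation used to "integrate" preceding outputs, and the names `temporal_conv`, `dense_tcn`, `make_layers`, and `receptive_field` are all assumptions made for illustration. The CTC loss and the fusion strategy are not covered.

```python
import random

def temporal_conv(x, weights, bias):
    """1D convolution over time with zero padding, followed by ReLU.

    x:       list of T frames, each a list of C_in floats
    weights: nested list indexed as [kernel_pos][c_in][c_out]
    bias:    list of C_out floats
    """
    T, k = len(x), len(weights)
    pad = k // 2
    out = []
    for t in range(T):
        frame = list(bias)
        for j in range(k):
            s = t + j - pad              # source frame index
            if 0 <= s < T:               # frames outside the clip count as zero
                for ci, v in enumerate(x[s]):
                    for co in range(len(frame)):
                        frame[co] += v * weights[j][ci][co]
        out.append([max(0.0, v) for v in frame])
    return out

def dense_tcn(x, layers):
    """Each layer sees the concatenation of the input and all earlier outputs."""
    feats = [list(f) for f in x]
    for weights, bias in layers:
        y = temporal_conv(feats, weights, bias)
        # Assumed integration rule: concatenate along the channel axis.
        feats = [f + o for f, o in zip(feats, y)]
    return feats

def make_layers(c_in, growth, num_layers, kernel=3, seed=0):
    """Hypothetical random initialisation; each layer adds `growth` channels."""
    rng = random.Random(seed)
    layers, c = [], c_in
    for _ in range(num_layers):
        w = [[[rng.uniform(-0.1, 0.1) for _ in range(growth)]
              for _ in range(c)] for _ in range(kernel)]
        layers.append((w, [0.0] * growth))
        c += growth
    return layers

def receptive_field(num_layers, kernel=3):
    """Frames seen by the deepest layer for stride-1, undilated convolutions."""
    return 1 + num_layers * (kernel - 1)

# Toy input: 10 frames with 4 features each.
x = [[float(t + c) for c in range(4)] for t in range(10)]
out = dense_tcn(x, make_layers(c_in=4, growth=2, num_layers=3))
```

Under these assumptions the sequence length is preserved, the channel count grows by `growth` per layer, and the receptive field grows linearly with depth, which is one way to read the abstract's claim that deeper TC layers capture longer-term context.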

Publisher

International Joint Conferences on Artificial Intelligence Organization

Cited by 46 articles.

1. Semantic-driven diffusion for sign language production with gloss-pose latent spaces alignment;Computer Vision and Image Understanding;2024-09

2. Temporal superimposed crossover module for effective continuous sign language;Machine Vision and Applications;2024-08-19

3. Inclusive Deaf Education Enabled by Artificial Intelligence: The Path to a Solution;International Journal of Artificial Intelligence in Education;2024-07-24

4. Benchmarking Micro-Action Recognition: Dataset, Methods, and Applications;IEEE Transactions on Circuits and Systems for Video Technology;2024-07

5. Dual-stage temporal perception network for continuous sign language recognition;The Visual Computer;2024-06-08