Sign Language Dataset for Automatic Motion Generation
Published: 2023-11-27
Volume: 9, Issue: 12, Page: 262
ISSN: 2313-433X
Container-title: Journal of Imaging
Short-container-title: J. Imaging
Language: en
Author:
María Villa-Monedero 1, Manuel Gil-Martín 1 (ORCID), Daniel Sáez-Trigueros 2, Andrzej Pomirski 3 (ORCID), Rubén San-Segundo 1 (ORCID)
Affiliation:
1. Grupo de Tecnología del Habla y Aprendizaje Automático (T.H.A.U. Group), Information Processing and Telecommunications Center, E.T.S.I. de Telecomunicación, Universidad Politécnica de Madrid, 28040 Madrid, Spain
2. Alexa AI, C. de Ramírez de Prado, 5, 28045 Madrid, Spain
3. Alexa AI, Aleja Grunwaldzka 472, 80-309 Gdańsk, Poland
Abstract
Several sign language datasets are available in the literature, most of them designed for sign language recognition and translation. This paper presents a new sign language dataset for automatic motion generation. This dataset includes phonemes for each sign (specified in HamNoSys, a transcription system developed at the University of Hamburg, Hamburg, Germany) and the corresponding motion information. The motion information includes sign videos and the sequence of extracted landmarks associated with relevant points of the skeleton (including face, arms, hands, and fingers). The dataset includes signs from three different subjects in three different positions, performing 754 signs that cover the entire alphabet, numbers from 0 to 100, numbers for hour specification, months, weekdays, and the most frequent signs used in Spanish Sign Language (LSE). In total, there are 6786 videos and their corresponding phonemes (HamNoSys annotations). From each video, a sequence of landmarks was extracted using MediaPipe. The dataset allows training an automatic system for motion generation from sign language phonemes. This paper also presents preliminary results in motion generation from sign phonemes, achieving a Dynamic Time Warping (DTW) distance per frame of 0.37.
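The reported metric, a DTW distance per frame, can be sketched as follows. This is not the authors' code: the per-frame Euclidean cost between landmark vectors and the normalization by the reference length are assumptions, chosen to match the common definition of a length-normalized DTW score.

```python
# Sketch (assumed implementation, not the authors' code): DTW distance
# between two landmark sequences, normalized by the number of reference
# frames to give a "distance per frame".
import math

def frame_cost(a, b):
    """Euclidean distance between two landmark vectors (one frame each)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def dtw_distance_per_frame(ref_seq, gen_seq):
    """DTW alignment cost divided by the reference length (assumed normalization)."""
    n, m = len(ref_seq), len(gen_seq)
    inf = float("inf")
    # d[i][j] = minimal cost of aligning ref_seq[:i] with gen_seq[:j]
    d = [[inf] * (m + 1) for _ in range(n + 1)]
    d[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            c = frame_cost(ref_seq[i - 1], gen_seq[j - 1])
            d[i][j] = c + min(d[i - 1][j],      # skip a generated frame
                              d[i][j - 1],      # skip a reference frame
                              d[i - 1][j - 1])  # match both frames
    return d[n][m] / n

# Identical sequences align at zero cost.
ref = [[0.0, 0.0], [1.0, 1.0], [2.0, 2.0]]
print(dtw_distance_per_frame(ref, ref))  # 0.0
```

In this formulation, a generated sequence that repeats or stretches frames of the reference can still align with zero cost, which is why DTW (rather than a frame-by-frame distance) is the usual choice for comparing motion sequences of different lengths.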
Subject
Electrical and Electronic Engineering; Computer Graphics and Computer-Aided Design; Computer Vision and Pattern Recognition; Radiology, Nuclear Medicine and Imaging
Cited by: 1 article.