Generalized Pose Decoupled Network for Unsupervised 3D Skeleton Sequence-Based Action Representation Learning

Published:2022-01 Issue: Volume:2022 Page:0002
ISSN:2692-7632
Container-title:Cyborg and Bionic Systems
language:en
Short-container-title:Cyborg Bionic Syst

Author:

Liu Mengyuan¹, Meng Fanyang², Liang Yongsheng³

Affiliation:

1. Key Laboratory of Machine Perception, Peking University, Shenzhen Graduate School, Shenzhen, China.

2. Peng Cheng Laboratory, Shenzhen, China.

3. Harbin Institute of Technology, Harbin, China.

Abstract

Human action representation is derived from the description of human shape and motion. The traditional unsupervised 3-dimensional (3D) human action representation learning method uses a recurrent neural network (RNN)-based autoencoder to reconstruct the input pose sequence and then takes the midlevel feature of the autoencoder as the representation. Although an RNN can implicitly learn a certain amount of motion information, the extracted representation mainly describes human shape and is insufficient to describe motion. Therefore, we first present a handcrafted motion feature called pose flow to guide the reconstruction of the autoencoder, whose midlevel feature is then expected to describe motion information. Its performance is limited, however, because actions can be distinctive in either motion direction or motion norm. For example, we can distinguish “sitting down” from “standing up” by motion direction, yet distinguish “running” from “jogging” by motion norm. In these cases, it is difficult to learn distinctive features from pose flow, where direction and norm are mixed. To this end, we present an explicit pose decoupled flow network (PDF-E) that learns from direction and norm in a multi-task learning framework, where one encoder generates the representation and two decoders generate direction and norm, respectively. Further, we use reconstruction of the input pose sequence as an additional constraint and present a generalized PDF network (PDF-G) that learns both motion and shape information and achieves state-of-the-art performance on large-scale and challenging 3D action recognition datasets, including the NTU RGB+D 60 and NTU RGB+D 120 datasets.
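The decoupling idea in the abstract can be illustrated with a minimal sketch. The abstract does not define pose flow precisely, so this sketch assumes it is the frame-to-frame displacement of each joint; the function names `pose_flow` and `decouple` and the toy sequences are hypothetical and are not taken from the paper.

```python
import math

def pose_flow(sequence):
    """Assumed definition: per-joint displacement between consecutive frames.

    sequence: list of T frames, each a list of J joints, each an (x, y, z) tuple.
    Returns T-1 frames of per-joint displacement vectors.
    """
    return [
        [tuple(b - a for a, b in zip(j0, j1)) for j0, j1 in zip(f0, f1)]
        for f0, f1 in zip(sequence, sequence[1:])
    ]

def decouple(flow, eps=1e-8):
    """Split each displacement into a unit direction and a scalar norm."""
    directions, norms = [], []
    for frame in flow:
        d_frame, n_frame = [], []
        for v in frame:
            n = math.sqrt(sum(c * c for c in v))
            n_frame.append(n)
            # A static joint has no meaningful direction; use a zero vector.
            d_frame.append(tuple(c / n for c in v) if n > eps else (0.0, 0.0, 0.0))
        directions.append(d_frame)
        norms.append(n_frame)
    return directions, norms

# Toy single-joint sequences (hypothetical data): same path, different speed.
slow = [[(0, 0, 0)], [(1, 0, 0)], [(2, 0, 0)]]
fast = [[(0, 0, 0)], [(3, 0, 0)], [(6, 0, 0)]]
# The slow sequence played backwards: same speed, opposite direction.
reverse = slow[::-1]

d_slow, n_slow = decouple(pose_flow(slow))
d_fast, n_fast = decouple(pose_flow(fast))
d_rev, n_rev = decouple(pose_flow(reverse))
```

In this toy case, `slow` and `fast` share the same directions and differ only in norm (analogous to "jogging" versus "running"), while `slow` and `reverse` share the same norms and differ only in direction (analogous to "standing up" versus "sitting down"). In the network described above, the two decoders would presumably be trained to predict these two targets separately.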

Publisher

American Association for the Advancement of Science (AAAS)

Subject

Applied Mathematics, General Mathematics

References: 45 articles.

1. Action-stage emphasized spatiotemporal VLAD for video action recognition;Tu Z;IEEE Trans Image Process,2019

2. Meng H, Pears N, Bailey C. A human action recognition system for embedded computer vision application. Paper presented at: CVPR 2007. Proceedings of the 2007 IEEE Conference on Computer Vision and Pattern Recognition; 2007 June 17–22; Minneapolis, MN, USA.

3. A survey on vision-based human action recognition;Poppe R;Image Vis Comput,2010

4. Application research on optimization algorithm of sEMG gesture recognition based on light CNN+LSTM model;Bai D;Cyborg Bionic Syst,2021

5. Salient pairwise spatio-temporal interest points for real-time activity recognition;Liu M;CAAI Trans Intell Technol,2016

Cited by 23 articles.

1. Monitoring and prediction of the LULC change dynamics using time series remote sensing data with Google Earth Engine;Physics and Chemistry of the Earth, Parts A/B/C;2024-12

2. Innovative Orthopedic Solutions for AI-Optimized Piezoelectric Implants for Superior Patient Care;Applied Sciences;2024-08-23

3. Explore human parsing modality for action recognition;CAAI Transactions on Intelligence Technology;2024-08-16

4. SFMVIT: Slowfast Meet VIT in Chaotic World;2024 IEEE International Conference on Multimedia and Expo Workshops (ICMEW);2024-07-15

5. Employing a combination of direct ink writing and near-infrared-induced photopolymerization facilitates 3D printing of unsupported multi-scale ceramics: The bone tissue engineering approach;European Polymer Journal;2024-06