Part Aware Contrastive Learning for Self-Supervised Action Recognition

Published: 2023-08
Container-title: Proceedings of the Thirty-Second International Joint Conference on Artificial Intelligence

Author:

Hua Yilei¹, Wu Wenhan², Zheng Ce³, Lu Aidong², Liu Mengyuan⁴, Chen Chen³, Wu Shiqian¹

Affiliation:

1. School of Information Science and Engineering, Wuhan University of Science and Technology

2. University of North Carolina at Charlotte

3. Center for Research in Computer Vision, University of Central Florida

4. Peking University, Shenzhen Graduate School

Abstract

In recent years, remarkable results have been achieved in self-supervised action recognition using skeleton sequences with contrastive learning. It has been observed that the semantic distinction between human actions is often carried by local body parts, such as the legs or hands, which makes these parts valuable for skeleton-based action recognition. This paper proposes an attention-based contrastive learning framework for skeleton representation learning, called SkeAttnCLR, which integrates local similarity and global features for skeleton-based action representations. To achieve this, a multi-head attention mask module is employed to learn soft attention mask features from the skeletons, suppressing non-salient local features while accentuating salient local features, thereby bringing similar local features closer in the feature space. Additionally, the set of contrastive pairs is expanded by pairing salient and non-salient features with global features, which guides the network to learn semantic representations of the entire skeleton. With the attention mask mechanism, SkeAttnCLR thus learns local features under different data augmentation views. The experimental results demonstrate that the inclusion of local feature similarity significantly enhances skeleton-based action representation. The proposed SkeAttnCLR outperforms state-of-the-art methods on the NTU RGB+D, NTU RGB+D 120, and PKU-MMD datasets. The code and settings are available at this repository: https://github.com/GitHubOfHyl97/SkeAttnCLR.
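
The following is a minimal sketch of the idea described in the abstract, not the authors' implementation (see the repository above for that). It assumes a single sigmoid mask per body part in place of the paper's multi-head attention module, and it assumes an InfoNCE-style loss, a common choice in contrastive learning. The names `soft_mask_split` and `info_nce`, the pooling scheme, and the temperature value are all hypothetical.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def soft_mask_split(part_features, part_logits):
    """Pool per-part features into salient, non-salient, and global vectors.

    part_features: list of feature vectors, one per body part (assumed layout).
    part_logits:   one attention logit per part (assumed single-head mask).
    """
    masks = [sigmoid(z) for z in part_logits]
    dim = len(part_features[0])

    def pooled(weights):
        # Weighted average of the part features.
        total = sum(weights)
        return [
            sum(w * f[d] for w, f in zip(weights, part_features)) / total
            for d in range(dim)
        ]

    salient = pooled(masks)                         # mask accentuates salient parts
    non_salient = pooled([1.0 - m for m in masks])  # complement of the mask
    global_feat = pooled([1.0] * len(part_features))
    return salient, non_salient, global_feat


def cosine(a, b):
    # Assumes both vectors are non-zero.
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def info_nce(query, positive, negatives, temperature=0.07):
    """InfoNCE loss: small when query is close to positive and far from negatives."""
    pos = math.exp(cosine(query, positive) / temperature)
    neg = sum(math.exp(cosine(query, n) / temperature) for n in negatives)
    return -math.log(pos / (pos + neg))


# Two augmented views of the same (toy) skeleton, two parts with 2-D features.
view_a = [[1.0, 0.1], [0.2, 1.0]]
view_b = [[0.9, 0.0], [0.1, 1.1]]
logits = [3.0, -3.0]  # hypothetical mask: part 0 is salient

sal_a, non_a, glob_a = soft_mask_split(view_a, logits)
sal_b, non_b, glob_b = soft_mask_split(view_b, logits)

# Salient features of the two views form a positive pair; the non-salient
# feature serves as a negative here purely for illustration.
local_loss = info_nce(sal_a, sal_b, [non_b])
global_loss = info_nce(glob_a, glob_b, [non_b])
```

In this toy setup the salient vectors of the two views are nearly parallel, so `local_loss` is small; how the paper actually combines local and global terms is not specified in the abstract and is left out of the sketch.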

Publisher

International Joint Conferences on Artificial Intelligence Organization

Cited by 5 articles.

1. A lightweight attention-driven distillation model for human pose estimation;Pattern Recognition Letters;2024-09

2. Intelligent Surveillance of Airport Apron: Detection and Location of Abnormal Behavior in Typical Non-Cooperative Human Objects;Applied Sciences;2024-07-16

3. Edge-Joint Assisted and Salient Enhanced Self-Supervised Action Recognition;2024 IEEE 14th International Conference on Electronics Information and Emergency Communication (ICEIEC);2024-05-24

4. Self-supervised action representation learning from partial consistency skeleton sequences;Neural Computing and Applications;2024-04-21

5. LAMP: Leveraging Language Prompts for Multi-Person Pose Estimation;2023 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS);2023-10-01