Authors:
Wang Pichao, Li Wanqing, Wan Jun, Ogunbona Philip, Liu Xinwang
Abstract
A novel deep neural network training paradigm that exploits the conjoint information in multiple heterogeneous sources is proposed. Specifically, in an RGB-D based action recognition task, it cooperatively trains a single convolutional neural network (named c-ConvNet) on both RGB visual features and depth features, and deeply aggregates the two kinds of features for action recognition. Unlike a conventional ConvNet, which learns deep separable features for homogeneous modality-based classification with only one softmax loss function, the c-ConvNet enhances the discriminative power of the deeply learned features and weakens the undesired modality discrepancy by jointly optimizing a ranking loss and a softmax loss over both homogeneous and heterogeneous modalities. The ranking loss consists of intra-modality and cross-modality triplet losses, and it reduces both the intra-modality and cross-modality feature variations. Furthermore, the correlations between RGB and depth data are embedded in the c-ConvNet; they can be retrieved from either modality and contribute to recognition even when only one of the modalities is available. The proposed method was extensively evaluated on two large RGB-D action recognition datasets, ChaLearn LAP IsoGD and NTU RGB+D, and one small dataset, SYSU 3D HOI, and achieved state-of-the-art results.
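A minimal PyTorch sketch (not taken from the paper) of how such a joint objective could be assembled from the losses the abstract names: a softmax (cross-entropy) term per modality, plus intra-modality and cross-modality triplet terms. The margin, the weighting factor lam, and the pre-sampled positive/negative indices idx_pos/idx_neg are illustrative assumptions; the paper's own sampling and weighting may differ.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class JointRankingSoftmaxLoss(nn.Module):
    """Sketch of a c-ConvNet-style joint objective: softmax losses on both
    modalities plus intra- and cross-modality triplet (ranking) losses."""

    def __init__(self, margin=0.5, lam=1.0):
        super().__init__()
        self.margin = margin  # triplet margin (assumed value)
        self.lam = lam        # softmax-vs-ranking weight (assumed value)

    def triplet(self, anchor, positive, negative):
        # Standard margin-based triplet loss on L2 feature distances.
        d_ap = F.pairwise_distance(anchor, positive)
        d_an = F.pairwise_distance(anchor, negative)
        return F.relu(d_ap - d_an + self.margin).mean()

    def forward(self, feat_rgb, feat_depth, logits_rgb, logits_depth,
                labels, idx_pos, idx_neg):
        # Softmax losses on each homogeneous modality.
        ce = F.cross_entropy(logits_rgb, labels) + \
             F.cross_entropy(logits_depth, labels)
        # Intra-modality triplets: anchor, positive, negative from the
        # same modality (pulls same-class features together per modality).
        intra = self.triplet(feat_rgb, feat_rgb[idx_pos], feat_rgb[idx_neg]) + \
                self.triplet(feat_depth, feat_depth[idx_pos], feat_depth[idx_neg])
        # Cross-modality triplets: anchor from one modality, positive and
        # negative from the other (reduces the modality discrepancy).
        cross = self.triplet(feat_rgb, feat_depth[idx_pos], feat_depth[idx_neg]) + \
                self.triplet(feat_depth, feat_rgb[idx_pos], feat_rgb[idx_neg])
        return ce + self.lam * (intra + cross)
```

Because the cross-modality triplets tie RGB and depth features into a shared embedding, a network trained this way can, as the abstract claims, still benefit from the learned correlations when only one modality is present at test time.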
Publisher
Association for the Advancement of Artificial Intelligence (AAAI)
Cited by
32 articles.