Unsupervised Learning of Human Action Categories in Still Images with Deep Representations

Published:2019-11-30 Issue:4 Volume:15 Page:1-20
ISSN:1551-6857
Container-title:ACM Transactions on Multimedia Computing, Communications, and Applications
language:en
Short-container-title:ACM Trans. Multimedia Comput. Commun. Appl.

Author:

Zheng Yunpeng¹, Li Xuelong², Lu Xiaoqiang³

Affiliation:

1. Key Laboratory of Spectral Imaging Technology CAS, Xi’an Institute of Optics and Precision Mechanics, Chinese Academy of Sciences and the University of Chinese Academy of Sciences, Beijing, China

2. School of Computer Science and Center for OPTical IMagery Analysis and Learning (OPTIMAL), Northwestern Polytechnical University, Shaanxi, China

3. Key Laboratory of Spectral Imaging Technology CAS, Xi’an Institute of Optics and Precision Mechanics, Chinese Academy of Sciences, Shaanxi, China

Abstract

In this article, we propose a novel method for unsupervised learning of human action categories in still images. In contrast to previous methods, the proposed method explores distinctive information of actions directly from unlabeled image databases, attempting to learn discriminative deep representations in an unsupervised manner to distinguish different actions. In the proposed method, action image collections can be used without manual annotations. Specifically, (i) to deal with the problem that unsupervised discriminative deep representations are difficult to learn, the proposed method builds a training dataset with surrogate labels from the unlabeled dataset, then learns discriminative representations by alternately updating convolutional neural network (CNN) parameters and the surrogate training dataset in an iterative manner; (ii) to explore the discriminatory information among different action categories, training batches for updating the CNN parameters are built with triplet groups and the triplet loss function is introduced to update the CNN parameters; and (iii) to learn more discriminative deep representations, a Random Forest classifier is adopted to update the surrogate training dataset, and more beneficial triplet groups then can be built with the updated surrogate training dataset. Extensive experiments on four benchmark datasets demonstrate the effectiveness of the proposed method.
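
Steps (i) and (ii) of the abstract can be illustrated with a minimal sketch. It assumes the common hinge form of the triplet loss on squared Euclidean distances; the abstract does not give the paper's exact formulation. The function names, the margin value, and the toy feature vectors are hypothetical, and plain lists stand in for CNN features.

```python
import random


def squared_distance(u, v):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((a - b) ** 2 for a, b in zip(u, v))


def triplet_loss(anchor, positive, negative, margin=1.0):
    """Hinge-style triplet loss (assumed form, margin is a hypothetical value).

    The loss is zero once the negative is farther from the anchor than the
    positive by at least `margin`.
    """
    gap = squared_distance(anchor, positive) - squared_distance(anchor, negative)
    return max(0.0, gap + margin)


def build_triplets(surrogate_labels, rng):
    """Build one (anchor, positive, negative) index triplet per sample.

    Positives share the anchor's surrogate label; negatives carry a
    different one. Samples with no valid positive or negative are skipped.
    """
    by_label = {}
    for idx, label in enumerate(surrogate_labels):
        by_label.setdefault(label, []).append(idx)

    triplets = []
    for idx, label in enumerate(surrogate_labels):
        positives = [j for j in by_label[label] if j != idx]
        negatives = [j for j, other in enumerate(surrogate_labels) if other != label]
        if positives and negatives:
            triplets.append((idx, rng.choice(positives), rng.choice(negatives)))
    return triplets


# Toy stand-ins for deep features and surrogate labels (not real data).
features = [[0.0, 0.0], [0.1, 0.0], [2.0, 2.0], [2.1, 2.0]]
surrogate_labels = [0, 0, 1, 1]

triplets = build_triplets(surrogate_labels, random.Random(0))
losses = [triplet_loss(features[a], features[p], features[n]) for a, p, n in triplets]
print(triplets)
print(losses)
```

In the method described above, the surrogate labels would be refreshed between training rounds (the abstract names a Random Forest classifier for that step), so the triplets would be rebuilt from the updated labels each round. That update step is not shown here.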

Funder

Key Research Program of Frontier Sciences, CAS

National Key R&D Program of China

CAS “Light of West China” Program

National Natural Science Foundation of China

Young Top-notch Talent Program of Chinese Academy of Sciences

Publisher

Association for Computing Machinery (ACM)

Subject

Computer Networks and Communications, Hardware and Architecture

Link

https://dl.acm.org/doi/pdf/10.1145/3362161

References: 74 articles.

1. Ensemble of Deep Models for Event Recognition

2. Deep Unsupervised Similarity Learning Using Partially Ordered Sets

3. Miguel Ángel Bautista, Artsiom Sanakoyeu, Ekaterina Tikhoncheva, and Björn Ommer. 2016. CliqueCNN: Deep unsupervised exemplar learning. In Advances in Neural Information Processing Systems. NIPSF, 3846-3854.

4. Image Classification using Random Forests and Ferns

Cited by 5 articles.

1. An efficient Meta-VSW method for ship behaviors recognition and application;Ocean Engineering;2024-11

2. Relation with Free Objects for Action Recognition;ACM Transactions on Multimedia Computing, Communications, and Applications;2023-10-18

3. SSRT: A Sequential Skeleton RGB Transformer to Recognize Fine-Grained Human-Object Interactions and Action Recognition;IEEE Access;2023

4. UNSUPERVISED MACHINE LEARNING ALGORITHM TO SOLVE KNIGHT COVERING PROBLEM FOR 6 BY 6 BOARD;Adıyaman Üniversitesi Mühendislik Bilimleri Dergisi;2021-10-25

5. Human behaviour recognition with mid‐level representations for crowd understanding and analysis;IET Image Processing;2021-02-25