Learning human activities and object affordances from RGB-D videos

Published:2013-07 Issue:8 Volume:32 Page:951-970
ISSN:0278-3649
Container-title:The International Journal of Robotics Research
language:en

Author:

Koppula Hema Swetha¹, Gupta Rudhir¹, Saxena Ashutosh¹

Affiliation:

1. Department of Computer Science, Cornell University, USA

Abstract

Understanding human activities and object affordances are two very important skills, especially for personal robots which operate in human environments. In this work, we consider the problem of extracting a descriptive labeling of the sequence of sub-activities being performed by a human, and more importantly, of their interactions with the objects in the form of associated affordances. Given an RGB-D video, we jointly model the human activities and object affordances as a Markov random field where the nodes represent objects and sub-activities, and the edges represent the relationships between object affordances, their relations with sub-activities, and their evolution over time. We formulate the learning problem using a structural support vector machine (SSVM) approach, where labelings over various alternate temporal segmentations are considered as latent variables. We tested our method on a challenging dataset comprising 120 activity videos collected from 4 subjects, and obtained an accuracy of 79.4% for affordance, 63.4% for sub-activity and 75.0% for high-level activity labeling. We then demonstrate the use of such descriptive labeling in performing assistive tasks by a PR2 robot.
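
The joint labeling idea in the abstract can be illustrated with a toy example. The sketch below is not the authors' implementation: it scores a joint assignment of sub-activity and affordance labels over two temporal segments by summing node scores, object/sub-activity edge scores, and temporal edge scores, then finds the best assignment by brute-force enumeration. The label sets, the score tables, and the function names `score` and `map_labeling` are all hypothetical, and the SSVM learning step and the latent temporal segmentations are left out entirely.

```python
from itertools import product

# Hypothetical label sets, chosen only for illustration.
SUB_ACTIVITIES = ["reaching", "drinking"]
AFFORDANCES = ["reachable", "drinkable"]

# Hypothetical node scores, one dict per temporal segment.
# In a real system these would come from learned weights and video features.
sub_node = [
    {"reaching": 1.0, "drinking": 0.2},
    {"reaching": 0.3, "drinking": 0.9},
]
obj_node = [
    {"reachable": 0.8, "drinkable": 0.4},
    {"reachable": 0.5, "drinkable": 0.6},
]

# Hypothetical edge scores; any pair not listed scores 0.
obj_sub_edge = {("reachable", "reaching"): 0.5, ("drinkable", "drinking"): 0.5}
sub_temporal = {("reaching", "drinking"): 0.4}
aff_temporal = {("reachable", "drinkable"): 0.3}


def score(subs, affs):
    """Sum node and edge scores for one joint labeling of all segments."""
    total = 0.0
    for t in range(len(subs)):
        # Node terms: one sub-activity node and one object node per segment.
        total += sub_node[t][subs[t]] + obj_node[t][affs[t]]
        # Edge between the object affordance and the sub-activity in a segment.
        total += obj_sub_edge.get((affs[t], subs[t]), 0.0)
        if t > 0:
            # Temporal edges linking consecutive segments.
            total += sub_temporal.get((subs[t - 1], subs[t]), 0.0)
            total += aff_temporal.get((affs[t - 1], affs[t]), 0.0)
    return total


def map_labeling():
    """Return the highest-scoring joint labeling by exhaustive search."""
    n = len(sub_node)
    candidates = (
        (subs, affs)
        for subs in product(SUB_ACTIVITIES, repeat=n)
        for affs in product(AFFORDANCES, repeat=n)
    )
    best = max(candidates, key=lambda pair: score(*pair))
    return best, score(*best)


if __name__ == "__main__":
    labeling, value = map_labeling()
    print(labeling, round(value, 3))
```

Exhaustive search is only workable at this toy size, since the number of joint labelings grows exponentially with the number of segments and objects; a practical system would need a proper inference procedure.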

Publisher

SAGE Publications

Subject

Applied Mathematics, Artificial Intelligence, Electrical and Electronic Engineering, Mechanical Engineering, Modelling and Simulation, Software

Link

http://journals.sagepub.com/doi/pdf/10.1177/0278364913478446

Reference: 72 articles.

1. Human activity analysis

2. Learning the semantics of object–action relations by observation

Cited by 406 articles.

1. Deep scene understanding with extended text description for human object interaction detection;Expert Systems with Applications;2025-01

2. Review on synergizing the Metaverse and AI-driven synthetic data: enhancing virtual realms and activity recognition in computer vision;Visual Intelligence;2024-09-09

3. Deep learning for computer vision based activity recognition and fall detection of the elderly: a systematic review;Applied Intelligence;2024-07-08

4. Multimodal vision-based human action recognition using deep learning: a review;Artificial Intelligence Review;2024-06-19

5. Human Activity Recognition Through Images Using a Deep Learning Approach;2024-05-30