Deep Learning Approach for Human Action Recognition Using a Time Saliency Map Based on Motion Features Considering Camera Movement and Shot in Video Image Sequences

Published:2023-11-15 Issue:11 Volume:14 Page:616
ISSN:2078-2489
Container-title:Information
language:en
Short-container-title:Information

Author:

Alavigharahbagh Abdorreza¹, Hajihashemi Vahid¹, Machado José J. M.², Tavares João Manuel R. S.²

Affiliation:

1. Faculdade de Engenharia, Universidade do Porto, Rua Dr. Roberto Frias, s/n, 4200-465 Porto, Portugal

2. Departamento de Engenharia Mecânica, Faculdade de Engenharia, Universidade do Porto, Rua Dr. Roberto Frias, s/n, 4200-465 Porto, Portugal

Abstract

This article proposes a hierarchical method for human action recognition (HAR) based on temporal and spatial features. In current HAR methods, camera movement, sensor movement, sudden scene changes, and scene movement can increase motion feature errors and decrease accuracy. Another important aspect of a HAR method is its computational cost. The proposed method adds a preprocessing step to address these challenges: it uses optical flow to detect camera movements and shots in the input video image sequences. In the temporal processing block, optical flow is combined with the absolute value of frame differences to obtain a time saliency map. Detecting shots, cancelling camera movement, and building the time saliency map together minimise movement detection errors. The time saliency map is then passed to the spatial processing block, which segments the moving persons and/or objects in the scene. Because the search region for spatial processing is limited by the temporal processing results, the computations in the spatial domain are drastically reduced. In the spatial processing block, the scene foreground is extracted in three steps: silhouette extraction, active contour segmentation, and colour segmentation. Key points are selected at the borders of the segmented foreground. The final features are the intensity and angle of the optical flow at the detected key points. Using key point features for action detection reduces the computational cost of the classification step and the required training time. Finally, the features are submitted to a Recurrent Neural Network (RNN) to recognise the action involved. The proposed method was tested, and its efficiency evaluated, on four well-known action datasets: KTH, Weizmann, HMDB51, and UCF101. Since the proposed approach segments salient objects based on motion, edges, and colour features, it can be added as a preprocessing step to most current HAR systems to improve performance.
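The temporal processing block described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the abstract does not say how camera movement is cancelled or how optical flow and frame differences are combined, so the median-flow subtraction, the product rule, and every function name below are assumptions made for illustration. The optical flow field is taken as a precomputed input.

```python
import numpy as np


def compensate_camera_motion(flow):
    """Subtract the median flow vector from a dense flow field of shape (H, W, 2).

    Assumption: camera movement is approximated by a single global translation.
    """
    global_motion = np.median(flow.reshape(-1, 2), axis=0)
    return flow - global_motion


def time_saliency_map(prev_frame, next_frame, flow, eps=1e-8):
    """Combine flow magnitude with the absolute frame difference.

    Assumption: the two cues are normalised to [0, 1] and multiplied.
    """
    residual = compensate_camera_motion(flow)
    magnitude = np.linalg.norm(residual, axis=-1)
    frame_diff = np.abs(next_frame.astype(np.float64) - prev_frame.astype(np.float64))
    magnitude = magnitude / (magnitude.max() + eps)
    frame_diff = frame_diff / (frame_diff.max() + eps)
    return magnitude * frame_diff


def keypoint_flow_features(flow, keypoints):
    """Return [intensity, angle in radians] of the flow at each (row, col) key point."""
    rows, cols = keypoints[:, 0], keypoints[:, 1]
    dx, dy = flow[rows, cols, 0], flow[rows, cols, 1]
    return np.stack([np.hypot(dx, dy), np.arctan2(dy, dx)], axis=1)


# Synthetic example: the camera pans right by 1 px per frame while a
# 2x2 object moves a further 3 px to the right.
prev_frame = np.zeros((8, 8), dtype=np.uint8)
next_frame = prev_frame.copy()
next_frame[3:5, 3:5] = 255
flow = np.zeros((8, 8, 2))
flow[..., 0] = 1.0
flow[3:5, 3:5, 0] = 4.0

saliency = time_saliency_map(prev_frame, next_frame, flow)
features = keypoint_flow_features(
    compensate_camera_motion(flow), np.array([[3, 3], [4, 4]])
)
```

In this toy case the saliency map is nonzero only over the moving object, because the global pan is removed before the flow magnitude is computed. The per-key-point intensity and angle pairs would then form the sequence passed to a recurrent classifier.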

Publisher

MDPI AG

Subject

Information Systems

Link

https://www.mdpi.com/2078-2489/14/11/616/pdf

Reference: 137 articles.

1. Caetano, C., dos Santos, J.A., and Schwartz, W.R. (2016, January 4–8). Optical Flow Co-occurrence Matrices: A novel spatiotemporal feature descriptor. Proceedings of the 2016 23rd International Conference on Pattern Recognition (ICPR), Cancun, Mexico.

2. Gupta, A., and Balan, M.S. (2018, January 1). Action recognition from optical flow visualizations. Proceedings of the 2nd International Conference on Computer Vision & Image Processing, Roorkee, India.

3. Kumar, S.S., and John, M. (2016, January 24–27). Human activity recognition using optical flow based feature set. Proceedings of the 2016 IEEE International Carnahan Conference on Security Technology (ICCST), Orlando, FL, USA.

4. Rashwan (2020). Action representation and recognition through temporal co-occurrence of flow fields and convolutional neural networks. Multimed. Tools Appl.

5. Rashwan (2019). Gait representation and recognition from temporal co-occurrence of flow fields. Mach. Vis. Appl.

Cited by 3 articles.

1. Enhanced human motion detection with hybrid RDA-WOA-based RNN and multiple hypothesis tracking for occlusion handling;Image and Vision Computing;2024-10

2. Hybrid time-spatial video saliency detection method to enhance human action recognition systems;Multimedia Tools and Applications;2024-02-14

3. Abnormal Action Recognition in Social Media Clips Using Deep Learning to Analyze Behavioral Change;Lecture Notes in Networks and Systems;2024