Abstract
Modeling spatiotemporal representations is one of the most essential yet challenging problems in video action recognition. Existing methods fail to accurately model either the correlations between spatial and temporal features or the global temporal dependencies. Inspired by the two-stream network for video action recognition, we propose an encoder–decoder framework named the Two-Stream Bidirectional Long Short-Term Memory (LSTM) Residual Network (TBRNet), which exploits the interaction between spatiotemporal representations and global temporal dependencies. In the encoding phase, a two-stream architecture built on the proposed Residual Convolutional 3D (Res-C3D) network extracts features, with residual connections inserted between the two pathways, and the features of the two streams are then fused into the encoder's short-term spatiotemporal features. In the decoding phase, these short-term spatiotemporal features are first fed into a temporal-attention-based bidirectional LSTM (BiLSTM) network to obtain long-term bidirectional attention-pooling dependencies, which are then integrated with the short-term spatiotemporal features to capture global spatiotemporal relationships. On two benchmark datasets, UCF101 and HMDB51, a series of experiments verifies the effectiveness of the proposed TBRNet, which achieves results competitive with, and in some cases better than, existing state-of-the-art approaches.
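To make the pipeline concrete, below is a minimal, hypothetical PyTorch sketch of the architecture as described in the abstract: a two-stream Res-C3D-style encoder with a cross-stream residual interaction, a temporal-attention BiLSTM decoder, and a final integration of short- and long-term features. All class names, channel sizes, the fusion rule, and the attention form are illustrative assumptions and do not reproduce the paper's actual design.

# Hypothetical sketch of the TBRNet structure from the abstract.
# Module names, channel sizes, and fusion/attention details are assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F

class ResC3DBlock(nn.Module):
    """A basic 3D-convolutional residual block (assumed structure)."""
    def __init__(self, channels):
        super().__init__()
        self.conv1 = nn.Conv3d(channels, channels, kernel_size=3, padding=1)
        self.conv2 = nn.Conv3d(channels, channels, kernel_size=3, padding=1)

    def forward(self, x):
        out = F.relu(self.conv1(x))
        out = self.conv2(out)
        return F.relu(out + x)  # residual connection

class TBRNetSketch(nn.Module):
    def __init__(self, channels=16, hidden=128, num_classes=101):
        super().__init__()
        # Two-stream encoder: one pathway per modality (RGB / optical flow).
        self.rgb_stem = nn.Conv3d(3, channels, kernel_size=3, padding=1)
        self.flow_stem = nn.Conv3d(2, channels, kernel_size=3, padding=1)
        self.rgb_block = ResC3DBlock(channels)
        self.flow_block = ResC3DBlock(channels)
        # Decoder: temporal-attention BiLSTM over fused short-term features.
        self.bilstm = nn.LSTM(channels, hidden, batch_first=True,
                              bidirectional=True)
        self.attn = nn.Linear(2 * hidden, 1)
        # Short-term features are projected, then concatenated with the
        # attention-pooled long-term dependencies before classification.
        self.short_proj = nn.Linear(channels, 2 * hidden)
        self.fc = nn.Linear(4 * hidden, num_classes)

    def forward(self, rgb, flow):
        # rgb: (B, 3, T, H, W); flow: (B, 2, T, H, W)
        r = self.rgb_block(F.relu(self.rgb_stem(rgb)))
        f = self.flow_block(F.relu(self.flow_stem(flow)))
        # Cross-stream interaction (assumed here: simple addition), then
        # pooling into per-time-step short-term spatiotemporal features.
        fused = r + f                                   # (B, C, T, H, W)
        short = fused.mean(dim=(3, 4)).transpose(1, 2)  # (B, T, C)
        # BiLSTM + temporal attention pooling -> long-term dependencies.
        seq, _ = self.bilstm(short)                     # (B, T, 2H)
        weights = torch.softmax(self.attn(seq), dim=1)  # (B, T, 1)
        long_term = (weights * seq).sum(dim=1)          # (B, 2H)
        # Integrate short-term and long-term representations.
        global_feat = torch.cat(
            [self.short_proj(short.mean(dim=1)), long_term], dim=1)
        return self.fc(global_feat)

if __name__ == "__main__":
    model = TBRNetSketch()
    rgb = torch.randn(2, 3, 8, 32, 32)
    flow = torch.randn(2, 2, 8, 32, 32)
    print(model(rgb, flow).shape)  # torch.Size([2, 101])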
Subject
Computational Mathematics, Computational Theory and Mathematics, Numerical Analysis, Theoretical Computer Science
Cited by
8 articles.