Semantic Adversarial Network with Multi-Scale Pyramid Attention for Video Classification

Published:2019-07-17 Issue: Volume:33 Page:9030-9037
ISSN:2374-3468
Container-title:Proceedings of the AAAI Conference on Artificial Intelligence
Short-container-title:AAAI

Author:

Xie De, Deng Cheng, Wang Hao, Li Chao, Tao Dapeng

Abstract

Two-stream architectures have shown strong performance in video classification tasks. The key idea is to learn spatiotemporal features by fusing convolutional networks spatially and temporally. However, such architectures have several problems. First, they rely on optical flow to model temporal information, which is often expensive to compute and store. Second, they have limited ability to capture details and local context information in video data. Third, they lack explicit semantic guidance, which greatly decreases classification performance. In this paper, we propose a new two-stream-based deep framework for video classification that discovers spatial and temporal information from RGB frames alone. Moreover, a multi-scale pyramid attention (MPA) layer and a semantic adversarial learning (SAL) module are introduced and integrated into our framework. The MPA enables the network to capture global and local features to generate a comprehensive representation of the video, and the SAL makes this representation gradually approximate the real video semantics in an adversarial manner. Experimental results on two public benchmarks demonstrate that the proposed method achieves state-of-the-art results.

Publisher

Association for the Advancement of Artificial Intelligence (AAAI)

Subject

General Medicine

Cited by 7 articles.

1. A2SN: attention based two stream network for sports video classification;Multimedia Tools and Applications;2024-02-08

2. Rich Action-Semantic Consistent Knowledge for Early Action Prediction;IEEE Transactions on Image Processing;2024

3. Thermal Infrared Image Colorization for Nighttime Driving Scenes With Top-Down Guided Attention;IEEE Transactions on Intelligent Transportation Systems;2022-09

4. QuasiVSD: efficient dual-frame smoke detection;Neural Computing and Applications;2022-02-16

5. Object-Agnostic Transformers for Video Referring Segmentation;IEEE Transactions on Image Processing;2022