Affiliation:
1. Faculty of Applied Sciences, Macao Polytechnic University, Macau, China
2. Engineering Research Centre of Applied Technology on Machine Translation and Artificial Intelligence, Macao Polytechnic University, Macau, China
Abstract
Video is now a ubiquitous medium on social platforms. Video summarisation has therefore become an important task in information extraction, where the high redundancy among key scenes makes it difficult to retrieve the important content. To address this challenge, this work presents a novel Graph Attention (GAT)-based bi-directional content-adaptive recurrent unit (Bi-CARU) model for video summarisation. The model uses graph attention to transform the visual features of scenes of interest in a video. This transformation is performed by a mechanism called Adaptive Feature-based Transformation (AFT), which extracts visual features and elevates them to a higher-level representation. We also introduce a GAT-based attention model that extracts the dominant features from the attention weights, reflecting the human tendency to attend to transitions and moving objects. The higher-level visual features produced by the attention layer are then fused with the semantic features processed by Bi-CARU. By combining visual and semantic information, the proposed model improves the accuracy of key-scene determination. By reducing the redundancy among salient content, our method provides a competitive and efficient way to summarise videos. Experimental results show that our approach outperforms existing state-of-the-art methods in video summarisation.
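To make the described pipeline concrete, the sketch below illustrates the overall data flow in PyTorch: a graph attention layer over per-frame visual features, a bi-directional recurrent semantic branch, and a fusion head that scores per-frame importance. This is a minimal sketch under stated assumptions, not the authors' implementation: all module names and dimensions are hypothetical, and nn.GRU stands in for the content-adaptive recurrent unit (CARU), whose definition is not given here.

```python
# Minimal sketch of a GAT + bi-directional recurrent summariser.
# Assumptions: frames arrive as precomputed visual feature vectors;
# nn.GRU is a stand-in for CARU; dimensions are illustrative.
import torch
import torch.nn as nn
import torch.nn.functional as F

class GraphAttentionLayer(nn.Module):
    """Single-head graph attention over a fully connected frame graph."""
    def __init__(self, in_dim, out_dim):
        super().__init__()
        self.proj = nn.Linear(in_dim, out_dim, bias=False)
        self.attn = nn.Linear(2 * out_dim, 1, bias=False)

    def forward(self, x):                        # x: (T, in_dim), one node per frame
        h = self.proj(x)                         # (T, out_dim)
        T = h.size(0)
        hi = h.unsqueeze(1).expand(T, T, -1)     # (T, T, out_dim)
        hj = h.unsqueeze(0).expand(T, T, -1)
        e = F.leaky_relu(self.attn(torch.cat([hi, hj], dim=-1)), 0.2).squeeze(-1)
        alpha = torch.softmax(e, dim=-1)         # attention weight per frame pair
        return alpha @ h                         # higher-level visual features

class GATBiRNNSummariser(nn.Module):
    """GAT visual branch fused with a bi-directional recurrent semantic branch."""
    def __init__(self, vis_dim=1024, hid_dim=256):
        super().__init__()
        self.gat = GraphAttentionLayer(vis_dim, hid_dim)
        self.birnn = nn.GRU(vis_dim, hid_dim, bidirectional=True, batch_first=True)
        self.score = nn.Linear(3 * hid_dim, 1)   # fused features -> importance score

    def forward(self, frames):                   # frames: (T, vis_dim)
        visual = self.gat(frames)                # (T, hid_dim)
        semantic, _ = self.birnn(frames.unsqueeze(0))   # (1, T, 2*hid_dim)
        fused = torch.cat([visual, semantic.squeeze(0)], dim=-1)
        return torch.sigmoid(self.score(fused)).squeeze(-1)  # per-frame scores

scores = GATBiRNNSummariser()(torch.randn(120, 1024))  # 120 frames
print(scores.shape)  # torch.Size([120])
```

The sketch only mirrors the fusion of a higher-level visual branch with a bi-directional semantic branch as described in the abstract; in the paper's model the recurrent branch is the Bi-CARU unit and the attention follows the AFT mechanism.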
Funder
Macao Polytechnic University