Effective Video Summarization Using Channel Attention-Assisted Encoder–Decoder Framework

Authors:

Alharbi Faisal 1, Habib Shabana 2, Albattah Waleed 2, Jan Zahoor 3, Alanazi Meshari D. 4, Islam Muhammad 5

Affiliation:

1. Quantum Technologies and Advanced Computing Institute, King Abdulaziz City for Science and Technology, Riyadh 11442, Saudi Arabia

2. Department of Information Technology, College of Computer, Qassim University, Buraydah 51452, Saudi Arabia

3. Department of Computer Science, Islamia College Peshawar, Peshawar 25000, Pakistan

4. Department of Electrical Engineering, College of Engineering, Jouf University, Sakaka 72388, Saudi Arabia

5. Department of Electrical Engineering, College of Engineering, Qassim University, Buraydah 52571, Saudi Arabia

Abstract

A significant number of cameras regularly generate massive amounts of data, demanding considerable hardware, time, and labor to acquire, process, and monitor. Asymmetric frames within videos make automatic video summarization difficult, as the key content is hard to capture. Developments in computer vision have accelerated the seamless capture and analysis of high-resolution video content. Video summarization (VS) has garnered considerable interest due to its ability to provide concise summaries of lengthy videos. The current literature relies mainly on a reduced set of representative features extracted with shallow sequential networks. Therefore, this work utilizes an optimal feature-assisted visual intelligence framework for representative feature selection and summarization. Initially, an empirical analysis of several features is performed, and ultimately a fine-tuned InceptionV3 backbone is adopted for feature extraction, deviating from conventional approaches. Secondly, our encoder–decoder module captures complex relationships using five convolutional blocks and two transposed-convolution blocks. Thirdly, we introduce a channel attention mechanism that highlights interrelations between channels and prioritizes essential patterns, capturing complex, refined features for final summary generation. Additionally, comprehensive experiments and ablation studies validate our framework’s exceptional performance, which consistently surpasses state-of-the-art networks on two benchmark datasets (TVSum and SumMe).
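The following is a minimal, illustrative PyTorch sketch (not the authors’ released code) of the pipeline described in the abstract: 2048-dimensional frame features from an InceptionV3 backbone pass through a 1D encoder of five convolutional blocks, a squeeze-and-excitation-style channel attention block, and a decoder of two transposed-convolution blocks that predicts per-frame importance scores. The channel widths, kernel sizes, and the exact attention formulation are assumptions made for illustration only.

# Illustrative sketch (assumed details): frame features from a fine-tuned
# InceptionV3 backbone feed a 1D encoder-decoder with channel attention that
# predicts per-frame importance scores for summary generation.
import torch
import torch.nn as nn


class ChannelAttention(nn.Module):
    """Squeeze-and-excitation-style channel attention (assumed formulation)."""
    def __init__(self, channels: int, reduction: int = 16):
        super().__init__()
        self.fc = nn.Sequential(
            nn.Linear(channels, channels // reduction),
            nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels),
            nn.Sigmoid(),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, channels, frames)
        weights = self.fc(x.mean(dim=2))          # global average pool over frames
        return x * weights.unsqueeze(-1)          # re-weight channels


class SummaryEncoderDecoder(nn.Module):
    """Five conv blocks (encoder) + two transposed-conv blocks (decoder), per the abstract."""
    def __init__(self, in_dim: int = 2048, width: int = 256):
        super().__init__()
        dims = [in_dim, width, width, width, width, width]
        self.encoder = nn.Sequential(*[
            nn.Sequential(nn.Conv1d(dims[i], dims[i + 1], kernel_size=3, padding=1),
                          nn.BatchNorm1d(dims[i + 1]), nn.ReLU(inplace=True))
            for i in range(5)
        ])
        self.attention = ChannelAttention(width)
        self.decoder = nn.Sequential(
            nn.ConvTranspose1d(width, width, kernel_size=3, padding=1), nn.ReLU(inplace=True),
            nn.ConvTranspose1d(width, width, kernel_size=3, padding=1), nn.ReLU(inplace=True),
        )
        self.score_head = nn.Conv1d(width, 1, kernel_size=1)

    def forward(self, features: torch.Tensor) -> torch.Tensor:
        # features: (batch, frames, in_dim), e.g. 2048-D InceptionV3 pool features
        x = features.transpose(1, 2)              # -> (batch, in_dim, frames)
        x = self.encoder(x)
        x = self.attention(x)
        x = self.decoder(x)
        return torch.sigmoid(self.score_head(x)).squeeze(1)  # per-frame scores in [0, 1]


if __name__ == "__main__":
    model = SummaryEncoderDecoder()
    frame_features = torch.randn(1, 320, 2048)    # 320 frames of InceptionV3 features
    print(model(frame_features).shape)            # torch.Size([1, 320])

Frames whose predicted scores exceed a chosen threshold (or a knapsack-style budget, as is common on TVSum and SumMe) would then be selected to form the summary; that selection step is omitted here.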

Publisher

MDPI AG

