Transformer-Based Data-Driven Video Coding Acceleration for Industrial Applications

Published:2022-09-27 Volume:2022 Page:1-11
ISSN:1563-5147
Container-title:Mathematical Problems in Engineering
language:en
Short-container-title:Mathematical Problems in Engineering

Author:

Li Yixiao¹², Li Lixiang¹², Zhuang Zirui³, Fang Yuan¹², Peng Haipeng¹², Ling Nam⁴

Affiliation:

1. Information Security Center, State Key Laboratory of Networking and Switching Technology, Beijing University of Posts and Telecommunications, Beijing 100876, China

2. National Engineering Laboratory for Disaster Backup and Recovery, Beijing University of Posts and Telecommunications, Beijing 100876, China

3. State Key Laboratory of Networking and Switching Technology, Beijing University of Posts and Telecommunications, Beijing 100876, China

4. Department of Computer Science and Engineering, Santa Clara University, 95053 Santa Clara, USA

Abstract

With the rapid development of edge intelligence and smart industry, deep learning-based intelligent industrial solutions are being rapidly adopted in manufacturing. Many of these solutions, such as automatic manufacturing inspection, are based on computer vision and require fast, efficient video encoding so that video streams can be processed as quickly as possible, either at the edge cluster or in the cloud. As one of the most popular video coding standards, High Efficiency Video Coding (HEVC) has been applied to various industrial scenes. However, HEVC brings not only a higher compression rate but also a significant increase in encoding complexity, which hinders its practical application in industrial scenarios. Fortunately, the large amount of video coding data available in industry makes it possible to accelerate the encoding process. To speed up video coding in industrial scenes, this paper proposes a data-driven fast approach for coding tree unit (CTU) partitioning in HEVC intra coding. First, we propose a method to represent the partition result of a CTU as a column vector of length 21. Then, we use a large amount of encoding data produced in normal industrial scenes to train transformer models that predict the partitioning vector of a CTU. Finally, the partitioning structure of the CTU is generated from the partitioning vector after a postprocessing operation and used by an industrial encoder. Compared with the original HEVC encoder used by some industrial applications, experimental results show that our approach achieves a 58.77% reduction in encoding time with a 3.9% bit rate loss, which indicates that our data-driven approach to video coding has great potential in industrial applications.
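The abstract does not spell out how the length-21 vector is laid out. One plausible reading, assumed here and not confirmed by the abstract, is that a 64x64 CTU carries one split flag for the 64x64 CU, four for its 32x32 CUs, and sixteen for its 16x16 CUs (1 + 4 + 16 = 21). The sketch below illustrates that reading: it thresholds predicted split probabilities, enforces quadtree consistency (a child may split only if its parent did), and lists the resulting leaf CUs. The index order, the 0.5 threshold, and the function names `postprocess` and `leaf_cus` are all hypothetical; the paper's actual postprocessing may differ.

```python
import numpy as np

# Assumed layout of the 21 split flags (not taken from the paper):
#   index 0      : split flag of the 64x64 CU
#   indices 1-4  : split flags of the four 32x32 CUs, raster order
#   indices 5-20 : split flags of the sixteen 16x16 CUs, four per 32x32 parent

def postprocess(probs, threshold=0.5):
    """Threshold split probabilities and enforce quadtree consistency."""
    probs = np.asarray(probs, dtype=float)
    if probs.shape != (21,):
        raise ValueError("expected a vector of length 21")
    flags = (probs >= threshold).astype(int)
    # A 32x32 CU can only split if the 64x64 CU was split.
    flags[1:5] *= flags[0]
    # A 16x16 CU can only split if its 32x32 parent was split.
    for parent in range(4):
        start = 5 + 4 * parent
        flags[start:start + 4] *= flags[1 + parent]
    return flags

def leaf_cus(flags):
    """Return the leaf CUs of the quadtree as (x, y, size) tuples."""
    if not flags[0]:
        return [(0, 0, 64)]
    leaves = []
    for p in range(4):
        px, py = 32 * (p % 2), 32 * (p // 2)
        if not flags[1 + p]:
            leaves.append((px, py, 32))
            continue
        for c in range(4):
            cx, cy = px + 16 * (c % 2), py + 16 * (c // 2)
            if flags[5 + 4 * p + c]:
                # A split 16x16 CU yields four 8x8 CUs.
                leaves.extend(
                    (cx + 8 * (k % 2), cy + 8 * (k // 2), 8) for k in range(4)
                )
            else:
                leaves.append((cx, cy, 16))
    return leaves
```

Under this layout, a predicted flag for a 16x16 CU whose 32x32 parent is not split is simply cleared, so the output always describes a valid quadtree whose leaves tile the full 64x64 area.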

Funder

National Basic Research Program of China

Publisher

Hindawi Limited

Subject

General Engineering, General Mathematics

Link

http://downloads.hindawi.com/journals/mpe/2022/1440323.pdf
