Efficient Parallel Video Processing Techniques on GPU: From Framework to Implementation

Published: 2014 Issue: Volume: 2014 Page: 1-19
ISSN: 2356-6140
Container-title: The Scientific World Journal
language: en
Short-container-title: The Scientific World Journal

Author:

Su Huayou¹, Wen Mei¹, Wu Nan¹, Ren Ju¹, Zhang Chunyuan¹

Affiliation:

1. School of Computer Science and Science and Technology on Parallel and Distributed Processing Laboratory, National University of Defense Technology, Changsha, Hunan 410073, China

Abstract

By reorganizing the execution order and optimizing the data structure, we propose an efficient parallel framework for the H.264/AVC encoder on massively parallel architectures. We implemented the proposed framework in CUDA on NVIDIA’s GPU. Not only are the compute-intensive components of the H.264 encoder parallelized, but the control-intensive components, such as CAVLC and the deblocking filter, are also implemented efficiently. In addition, we propose a series of optimization methods, including multiresolution multiwindow motion estimation, a multilevel parallel strategy that exposes as much parallelism in intra coding as possible, component-based parallel CAVLC, and a direction-priority deblocking filter. More than 96% of the H.264 encoder's workload is offloaded to the GPU. Experimental results show that the parallel implementation achieves a 20-fold speedup over the serial program and meets the requirement of real-time HD encoding at 30 fps. At the same bitrate, the PSNR loss ranges from 0.14 dB to 0.77 dB. Analysis of the kernels shows that the speedup of the compute-intensive algorithms is proportional to the computational power of the GPU, whereas the performance of the control-intensive part (CAVLC) depends strongly on memory bandwidth, which offers insight for the design of new architectures.
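The relation between the offloaded share and the overall speedup reported in the abstract can be illustrated with Amdahl's law. The sketch below is not taken from the paper. It assumes that the 96% workload figure can be read as a fraction of the serial run time and that CPU and GPU work do not overlap; the abstract states neither. The function names are hypothetical.

```python
# Minimal Amdahl's-law sketch. Assumption (not from the paper): "96% of the
# workload" is treated as 96% of the serial run time, with no CPU/GPU overlap.

def amdahl_speedup(offloaded_fraction: float, kernel_speedup: float) -> float:
    """Overall speedup when a fraction of the run time is accelerated."""
    remaining = 1.0 - offloaded_fraction
    return 1.0 / (remaining + offloaded_fraction / kernel_speedup)


def required_kernel_speedup(offloaded_fraction: float, overall_speedup: float) -> float:
    """Speedup the offloaded part needs in order to reach a given overall speedup."""
    remaining = 1.0 - offloaded_fraction
    budget = 1.0 / overall_speedup - remaining
    if budget <= 0.0:
        raise ValueError("overall speedup exceeds the Amdahl upper bound")
    return offloaded_fraction / budget


if __name__ == "__main__":
    # Upper bound on overall speedup if the offloaded 96% took no time at all.
    print(1.0 / (1.0 - 0.96))                    # about 25
    # Speedup the offloaded part would need for a 20-fold overall speedup.
    print(required_kernel_speedup(0.96, 20.0))   # about 96
    # Per-frame time budget for 30 fps real-time encoding, in milliseconds.
    print(1000.0 / 30.0)                         # about 33.3
```

Under these assumptions, a 20-fold overall speedup sits close to the 25-fold ceiling set by the remaining 4% of the work, so the part that stays on the CPU has a large effect on the achievable total.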

Funder

National Natural Science Foundation of China

Publisher

Hindawi Limited

Subject

General Environmental Science; General Biochemistry, Genetics and Molecular Biology; General Medicine

Link

http://downloads.hindawi.com/journals/tswj/2014/716020.pdf

References: 12 articles.

1. Analysis and architecture design of an HDTV720p 30 frames/s H.264/AVC encoder

2. Highly Parallel Rate-Distortion Optimized Intra-Mode Decision on Multicore Graphics Processors

3. Overview of the H.264/AVC video coding standard

4. Low-complexity transform and quantization in H.264/AVC

5. Lecture Notes in Computer Science,2007

Cited by 12 articles.

1. GVLE: a highly optimized GPU-based implementation of variable-length encoding; The Journal of Supercomputing; 2022-12-18

2. CAVLCU: an efficient GPU-based implementation of CAVLC; The Journal of Supercomputing; 2021-11-29

3. Accelerating video encoding using cluster computing; Multimedia Tools and Applications; 2020-02-18

4. Overview of Research in the field of Video Compression using Deep Neural Networks; Multimedia Tools and Applications; 2020-01-07

5. Intra prediction with deep learning; Applications of Digital Image Processing XLI; 2018-09-17