Scalable Parallel Motion Estimation on Muti-GPU System

Published: 2013-08
Volume: 347-350
Pages: 3708-3714
ISSN: 1662-7482
Container-title: Applied Mechanics and Materials
Short-container-title: AMM

Author:

Chen Dong¹, Su Hua You¹, Mei Wen¹, Wang Li Xuan¹, Zhang Chun Yuan¹

Affiliation:

1. National University of Defense Technology

Abstract

With NVIDIA's parallel computing architecture CUDA, using GPUs to speed up compute-intensive applications has become a research focus in recent years. In this paper, we propose a scalable method for multi-GPU systems that accelerates motion estimation, the most time-consuming stage of video encoding. Based on an analysis of the data dependencies and of the multi-GPU architecture, we design a parallel computing model and a communication model. We tested the parallel algorithm on 10 standard video sequences at different resolutions using 4 NVIDIA GTX460 GPUs and calculated the overall speedup. The results show a speedup of 36.1 times with 1 GPU and more than 120 times with 4 GPUs on 1920x1080 sequences. Further, the parallel algorithm shows the potential for nearly linear speedup with the number of GPUs in the system.
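
The abstract does not say which search algorithm or data partitioning the authors use, so the following is only a minimal CPU-side sketch of the general idea, written in Python for readability rather than CUDA. It assumes full-search block matching with a sum-of-absolute-differences (SAD) cost and a contiguous split of block rows across devices. The function names, the block size, the search radius and the partitioning rule are all hypothetical choices made for illustration; they are not taken from the paper.

```python
def sad(cur, ref, bx, by, dx, dy, block):
    """Sum of absolute differences between a block of `cur` and a displaced block of `ref`."""
    total = 0
    for y in range(block):
        for x in range(block):
            total += abs(cur[by + y][bx + x] - ref[by + dy + y][bx + dx + x])
    return total


def full_search(cur, ref, bx, by, block, radius):
    """Return (dx, dy, cost) with the lowest SAD within +/- radius, staying inside the frame."""
    height, width = len(ref), len(ref[0])
    best = None
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            if not (0 <= by + dy and by + dy + block <= height):
                continue
            if not (0 <= bx + dx and bx + dx + block <= width):
                continue
            cost = sad(cur, ref, bx, by, dx, dy, block)
            if best is None or cost < best[2]:
                best = (dx, dy, cost)
    return best


def split_block_rows(num_block_rows, num_devices):
    """Divide block rows into contiguous, near-equal ranges, one per device (assumed scheme)."""
    base, extra = divmod(num_block_rows, num_devices)
    ranges, start = [], 0
    for d in range(num_devices):
        count = base + (1 if d < extra else 0)
        ranges.append((start, start + count))
        start += count
    return ranges


def estimate_motion(cur, ref, block=4, radius=2, num_devices=4):
    """Compute one motion vector per block, visiting each device's row range in turn."""
    rows, cols = len(cur) // block, len(cur[0]) // block
    vectors = {}
    for first, last in split_block_rows(rows, num_devices):
        # On real hardware each range would run concurrently on its own GPU;
        # here the ranges are processed sequentially to keep the sketch simple.
        for r in range(first, last):
            for c in range(cols):
                vectors[(r, c)] = full_search(cur, ref, c * block, r * block, block, radius)
    return vectors
```

Because each block's search reads only the current and reference frames and writes only its own vector, the row ranges are independent of one another in this simplified setting, which is the property that makes near-linear scaling plausible. Real encoders add dependencies (for example, motion vector prediction from neighbouring blocks) that this sketch ignores.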

Publisher

Trans Tech Publications, Ltd.

Link

https://www.scientific.net/AMM.347-350.3708.pdf

References (8 articles)

1. Yu-Cheng Lin, Pei-Lun Li, Chin-Hsiang Chang, Chi-Ling Wu, You-Ming Tsao, and Shao-Yi Chien, Multi-Pass algorithm of motion estimation in video encoding for generic GPU, Proc. IEEE Symp. Circuits and Systems, 21-24 May 2006, pp. 4451-4454.

2. Wei-Nien Chen and Hsueh-Ming Hang, H.264/AVC motion estimation implementation on compute unified device architecture (CUDA), IEEE International Conference on Multimedia and Expo (ICME08), June 23-26, 2008, pp. 697-700.

3. Bart Pieters, Charles F. Hollemeersch, Peter Lambert, and Rik Van de Walle, Motion estimation for H.264/AVC on multiple GPUs using NVIDIA CUDA, Applications of Digital Image Processing XXXII, August 2-5 2009, Vol. 7743 77430X-2.

4. Gan Xinbiao, Shen Li, and Wang Zhiying, Parallel full search algorithm for motion estimation using CUDA, Journal of Computer-Aided Design and Computer Graphics, vol. 22, Mar. 2010, pp. 457-460.

5. Youngsub Ko, Youngmin Yi, and Soonhoi Ha, An efficient parallel motion estimation algorithm and X264 parallelization in CUDA, Design and Architectures for Signal and Image Processing (DASIP), 2-4 Nov. 2011, pp. 1-8.