Affiliation:
1. National University of Defense Technology
Abstract
With NVIDIA's CUDA parallel computing architecture, using GPUs to accelerate compute-intensive applications has become a research focus in recent years. In this paper, we propose a scalable method for multi-GPU systems that accelerates motion estimation, the most time-consuming process in video encoding. Based on an analysis of data dependencies and of the multi-GPU architecture, we design a parallel computing model and a communication model. We tested the parallel algorithm on 10 standard video sequences at different resolutions using 4 NVIDIA GTX460 GPUs and calculated the overall speedup. The results show a speedup of 36.1 times with 1 GPU and more than 120 times with 4 GPUs on 1920x1080 sequences. Further, the parallel algorithm shows the potential for nearly linear speedup with the number of GPUs in the system.
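As a rough illustration of the kind of workload described above, the following is a minimal CPU-side sketch in Python, not the paper's implementation. The abstract does not state the search strategy or how work is divided among GPUs, so the sketch assumes a full-search block match using the sum of absolute differences (SAD) and a contiguous split of macroblock rows across devices. All function names (`sad`, `full_search`, `split_block_rows`, `estimate_motion`) are hypothetical, and the per-device loop runs sequentially where a real system would launch one kernel per GPU.

```python
def sad(cur, ref, cx, cy, rx, ry, bs):
    """Sum of absolute differences between the bs x bs block of `cur`
    at (cx, cy) and the bs x bs block of `ref` at (rx, ry)."""
    return sum(
        abs(cur[cy + j][cx + i] - ref[ry + j][rx + i])
        for j in range(bs)
        for i in range(bs)
    )


def full_search(cur, ref, bx, by, bs, radius):
    """Return (dx, dy, cost) of the lowest-SAD candidate for the block at
    (bx, by), trying every offset within +/- radius that stays in the frame."""
    h, w = len(ref), len(ref[0])
    best = None
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            rx, ry = bx + dx, by + dy
            if rx < 0 or ry < 0 or rx + bs > w or ry + bs > h:
                continue  # candidate block would fall outside the reference frame
            cost = sad(cur, ref, bx, by, rx, ry, bs)
            if best is None or cost < best[2]:
                best = (dx, dy, cost)
    return best


def split_block_rows(num_block_rows, num_devices):
    """Assumed partitioning: contiguous, near-equal ranges of block rows,
    one half-open (start, end) range per device."""
    base, extra = divmod(num_block_rows, num_devices)
    ranges, start = [], 0
    for d in range(num_devices):
        count = base + (1 if d < extra else 0)
        ranges.append((start, start + count))
        start += count
    return ranges


def estimate_motion(cur, ref, bs=4, radius=2, num_devices=4):
    """Compute one motion vector per block. Each row range stands in for the
    share of work one GPU would receive; here the ranges run sequentially."""
    rows, cols = len(cur) // bs, len(cur[0]) // bs
    vectors = {}
    for lo, hi in split_block_rows(rows, num_devices):
        for r in range(lo, hi):
            for c in range(cols):
                vectors[(r, c)] = full_search(cur, ref, c * bs, r * bs, bs, radius)
    return vectors
```

Each block's search reads only the current and reference frames and writes only its own vector, so under these assumptions the row ranges are independent of one another. That independence is what makes a near-linear split across devices plausible. An encoder that predicts motion vectors from neighboring blocks would add dependencies that this sketch does not model.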
Publisher
Trans Tech Publications, Ltd.