An efficient implementation of one-dimensional discrete wavelet transform algorithms for GPU architectures

Published: 2022-02-14 Issue: 9 Volume: 78 Page: 11539-11563
ISSN: 0920-8542
Container-title: The Journal of Supercomputing
Language: en
Short-container-title: J Supercomput

Author:

Stokfiszewski Kamil, Wieloch Kamil, Yatsymirskyy Mykhaylo

Abstract

In this paper, the authors present several self-developed implementation variants of Discrete Wavelet Transform (DWT) computation algorithms and compare their execution times against those of commonly accepted implementations on representative modern Graphics Processing Unit (GPU) architectures. The proposed solutions avoid the time-consuming modulo divisions and conditional instructions used for wrapping the DWT filters by suitably expanding the DWT input data vectors. The main goal of the research is to improve the computation times of popular DWT algorithms on representative modern GPU architectures while retaining the code's clarity and simplicity. The execution-time improvements obtained on GPUs are also compared with their counterparts on traditional sequential processors. The experimental study shows that the proposed implementations, when realized in parallel on GPUs, are characterized by very simple kernel code and high execution-time performance.
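The idea of replacing filter wrapping with an expanded input vector can be illustrated with a small sequential sketch. It is not the authors' GPU kernel: the function names, the choice of the 4-tap Daubechies filter pair, and the periodic extension by `L - 2` samples are assumptions made only for illustration, and the input length is assumed to be even and at least `L - 2`.

```python
import math

# Illustrative 4-tap Daubechies filter pair (an assumption; the abstract
# does not name a specific filter).
_S3, _D = math.sqrt(3.0), 4.0 * math.sqrt(2.0)
H = [(1 + _S3) / _D, (3 + _S3) / _D, (3 - _S3) / _D, (1 - _S3) / _D]  # low-pass
G = [H[3], -H[2], H[1], -H[0]]                                        # high-pass


def dwt_level_modulo(x, h, g):
    """One periodic DWT level; wraps the filter with a modulo per sample."""
    n, taps = len(x), len(h)
    approx, detail = [], []
    for i in range(0, n, 2):
        a = d = 0.0
        for k in range(taps):
            s = x[(i + k) % n]  # modulo division in the inner loop
            a += h[k] * s
            d += g[k] * s
        approx.append(a)
        detail.append(d)
    return approx, detail


def dwt_level_expanded(x, h, g):
    """Same transform, but the input is expanded once so no wrapping is needed."""
    n, taps = len(x), len(h)
    # Largest index read is (n - 2) + (taps - 1), so taps - 2 extra samples suffice.
    xe = x + x[:taps - 2]
    approx, detail = [], []
    for i in range(0, n, 2):
        a = d = 0.0
        for k in range(taps):
            s = xe[i + k]  # plain indexing: no modulo, no branch
            a += h[k] * s
            d += g[k] * s
        approx.append(a)
        detail.append(d)
    return approx, detail
```

Both functions compute the same coefficients; the second trades a few extra input samples for a simpler inner loop, which is the kind of trade-off the abstract describes for GPU kernels.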

Publisher

Springer Science and Business Media LLC

Subject

Hardware and Architecture, Information Systems, Theoretical Computer Science, Software

Link

https://link.springer.com/content/pdf/10.1007/s11227-022-04331-8.pdf

References (40 articles)

1. Porwik P (2015) Wybrane metody cyfrowego przetwarzania sygnałów z przykładami programów w Matlabie. Wydawnictwo Uniwersytetu Śląskiego, Katowice

2. Sorensen H (2012) “High-Performance Matrix-Vector Multiplication on the GPU”, M. Alexander et al. (Eds) Euro-Par 2011 Parallel Processing Workshops, Lecture Notes in Computer Science, vol 7155. Springer, Berlin, Heidelberg

3. Yatsymirskyy M, Stokfiszewski K (2012) “Effectiveness of lattice factorization of two-channel orthogonal filter banks”, 2012 Joint Conference New Trends In Audio, Video And Signal Processing: Algorithms, Architectures, Arrangements And Applications (NTAV/SPA), pp 275–279

4. Sanders J, Kandrot E (2010) CUDA by example: an introduction to general-purpose GPU programming. Addison-Wesley Professional, USA (ISBN: 978-0-13-138768-3)

5. Cheng J, Grossman M, McKercher T (2014) Professional CUDA C programming. John Wiley & Sons, Inc., Indianapolis (ISBN: 978-1-118-73932-7)

Cited by 8 articles.

1. Privacy-preserving human activity recognition using principal component-based wavelet CNN;Signal, Image and Video Processing;2024-09-02

2. A Novel Low-Complexity and Parallel Algorithm for DCT IV Transform and Its GPU Implementation;Applied Sciences;2024-08-24

3. Fine-tuning inflow prediction models: integrating optimization algorithms and TRMM data for enhanced accuracy;Water Science & Technology;2024-07-03

4. FPGA implementation of compact and low-power multiplierless architectures for DWT and IDWT;Journal of Real-Time Image Processing;2024-01-07

5. Machine learning based electrocardiogram peaks analyzer for Wolff-Parkinson-White syndrome;Biomedical Signal Processing and Control;2023-09