Improving Power of DSP and CNN Hardware Accelerators Using Approximate Floating-point Multipliers

Published:2021-09-30 Issue:5 Volume:20 Page:1-21
ISSN:1539-9087
Container-title:ACM Transactions on Embedded Computing Systems
language:en
Short-container-title:ACM Trans. Embed. Comput. Syst.

Author:

Leon Vasileios¹, Paparouni Theodora¹, Petrongonas Evangelos¹, Soudris Dimitrios¹, Pekmestzi Kiamal¹

Affiliation:

1. National Technical University of Athens, Athens, Greece

Abstract

Approximate computing has emerged as a promising design alternative for delivering power-efficient systems and circuits by exploiting the inherent error resiliency of numerous applications. The current article aims to tackle the increased hardware cost of floating-point multiplication units, which prohibits their usage in embedded computing. We introduce AFMU (Approximate Floating-point MUltiplier), an area/power-efficient family of multipliers, which apply two approximation techniques in the resource-hungry mantissa multiplication and can be seamlessly extended to support dynamic configuration of the approximation levels via gating signals. AFMU offers large accuracy configuration margins, incurs negligible logic overhead for dynamic configuration, and detects unexpected results that may arise due to the approximations. Our evaluation shows that AFMU delivers energy gains in the range 3.6%–53.5% for half-precision and 37.2%–82.4% for single-precision, in exchange for a mean relative error of around 0.05%–3.33% and 0.01%–2.20%, respectively. In comparison with state-of-the-art multipliers, AFMU exhibits up to 4–6× smaller error on average while delivering more energy-efficient computing. The evaluation in image processing shows that AFMU provides sufficient quality of service, i.e., more than 50 dB PSNR and SSIM values near 1, and up to 57.4% power reduction. When used in floating-point CNNs, the accuracy loss is small (or zero), i.e., up to 5.4% for MNIST and CIFAR-10, in exchange for up to 63.8% power gain.
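
The abstract does not name AFMU's two approximation techniques, so the sketch below is only a generic illustration of the trade-off it describes and is not the authors' design. It assumes plain truncation of low-order mantissa bits as a stand-in approximation and measures the resulting mean relative error of a single-precision product. The names `truncate_mantissa`, `approx_mul`, `mean_relative_error`, and the parameter `kept_bits` are hypothetical.

```python
import random
import struct


def truncate_mantissa(x: float, kept_bits: int) -> float:
    """Round x to float32, then zero all but the top `kept_bits` of its 23 stored mantissa bits.

    Assumption: truncation is only a stand-in for an approximate mantissa datapath.
    """
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    drop = 23 - kept_bits
    mask = (0xFFFFFFFF << drop) & 0xFFFFFFFF
    return struct.unpack("<f", struct.pack("<I", bits & mask))[0]


def approx_mul(a: float, b: float, kept_bits: int) -> float:
    """Multiply two operands after truncating their mantissas."""
    return truncate_mantissa(a, kept_bits) * truncate_mantissa(b, kept_bits)


def mean_relative_error(kept_bits: int, samples: int = 10000, seed: int = 0) -> float:
    """Average |approx - exact| / |exact| over random operands drawn from [1, 2)."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(samples):
        a = rng.uniform(1.0, 2.0)
        b = rng.uniform(1.0, 2.0)
        exact = a * b
        total += abs(approx_mul(a, b, kept_bits) - exact) / exact
    return total / samples


if __name__ == "__main__":
    # Fewer kept bits means a cheaper multiplier in hardware but a larger error.
    for kept in (4, 8, 12, 16):
        print(kept, mean_relative_error(kept))
```

Sweeping `kept_bits` mimics, in software only, the kind of accuracy configuration the abstract refers to; it says nothing about energy, which depends on the hardware implementation.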

Publisher

Association for Computing Machinery (ACM)

Subject

Hardware and Architecture, Software

Link

https://dl.acm.org/doi/pdf/10.1145/3448980

References: 49 articles.

1. Dual-quality 4:2 compressors for utilizing in dynamic accuracy configurable multipliers;Akbari Omid;IEEE Trans. Very Large Scale Integ. Syst.,2017

2. Rodinia: A benchmark suite for heterogeneous computing

3. Analysis and characterization of inherent application resilience for approximate computing

Cited by 17 articles.

1. Design of a Hardware-Efficient Floating-Point Multiplier with Dynamic Segmentation;2024 19th Conference on Ph.D Research in Microelectronics and Electronics (PRIME);2024-06-09

2. Low-Power High Precision Floating-Point Divider With Bidimensional Linear Approximation;IEEE Transactions on Circuits and Systems I: Regular Papers;2024

3. Anti-Rounding Image Steganography With Separable Fine-Tuned Network;IEEE Transactions on Circuits and Systems for Video Technology;2023-11

4. Hardware Acceleration Schemes for Convolutional Neural Networks;Journal of Physics: Conference Series;2023-11-01

5. MDCIM: MRAM-Based Digital Computing-in-Memory Macro for Floating-Point Computation with High Energy Efficiency and Low Area Overhead;Applied Sciences;2023-10-31