Algorithms for Efficient Reproducible Floating Point Summation

Author:

Peter Ahrens (1), James Demmel (2), Hong Diep Nguyen (2)

Affiliation:

1. Massachusetts Institute of Technology, Cambridge, MA, USA

2. University of California Berkeley, Berkeley, CA, USA

Abstract

We define “reproducibility” as getting bitwise identical results from multiple runs of the same program, perhaps with different hardware resources or other changes that should not affect the answer. Many users depend on reproducibility for debugging or correctness. However, dynamic scheduling of parallel computing resources, combined with nonassociative floating point addition, makes reproducibility challenging even for summation, or operations like the BLAS. We describe a “reproducible accumulator” data structure (the “binned number”) and associated algorithms to reproducibly sum binary floating point numbers, independent of summation order. We use a subset of the IEEE Floating Point Standard 754-2008 and bitwise operations on the standard representations in memory. Our approach requires only one read-only pass over the data, and one reduction in parallel, using a 6-word reproducible accumulator (more words can be used for higher accuracy), enabling standard tiling optimization techniques. Summing n words with a 6-word reproducible accumulator requires approximately 9n floating point operations (arithmetic, comparison, and absolute value) and approximately 3n bitwise operations. The final error bound with a 6-word reproducible accumulator and our default settings can be up to 2^29 times smaller than the error bound for conventional (recursive) summation on ill-conditioned double-precision inputs.
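
The order dependence described in the abstract, and one way to remove it, can be illustrated with a small sketch. This is not the paper's binned-number algorithm: it is a simplified single-bin pre-rounding scheme that needs two passes (one to find the largest magnitude, one to sum), whereas the abstract describes a one-pass method with a multi-word accumulator. The names `naive_sum`, `prerounded_sum`, and the `bits` parameter are hypothetical, and the sketch assumes finite inputs and fewer than about 2^23 values.

```python
import math
import random


def naive_sum(values):
    """Conventional left-to-right (recursive) summation; depends on order."""
    total = 0.0
    for v in values:
        total += v
    return total


def prerounded_sum(values, bits=30):
    """Order-independent sum (illustrative sketch, not the paper's method).

    Every input is rounded to a multiple of one common power of two, so all
    additions below are exact and the result cannot depend on the order.
    Assumes finite inputs and len(values) * 2**bits <= 2**53.
    """
    values = list(values)
    biggest = max((abs(v) for v in values), default=0.0)
    if biggest == 0.0:
        return 0.0
    # frexp returns biggest = m * 2**e with 0.5 <= m < 1, so every |v| < 2**e.
    _, e = math.frexp(biggest)
    shift = bits - e  # after scaling, every |v| * 2**shift < 2**bits
    total = 0.0
    for v in values:
        # Scaling by a power of two, then rounding to the nearest integer.
        # The rounding of each element is independent of the other elements.
        total += float(round(math.ldexp(v, shift)))
    # total is an exactly represented integer; undo the scaling.
    return math.ldexp(total, -shift)


# Floating point addition is not associative: reordering changes the answer.
print(naive_sum([1e16, 1.0, -1e16]), naive_sum([1e16, -1e16, 1.0]))

# The pre-rounded sum is bitwise identical under any permutation.
rng = random.Random(0)
data = [rng.uniform(-1e6, 1e6) for _ in range(1000)]
results = set()
for _ in range(10):
    rng.shuffle(data)
    results.add(prerounded_sum(data))
print(len(results))
```

The price of using a single bin is accuracy: anything smaller than the bin width is rounded away. Using several bins of decreasing width, as the multi-word accumulator in the abstract suggests, is what recovers the lost low-order bits.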

Funder

DARPA XDATA

HP

Nokia

DOE Computational Science Graduate Fellowship

Mathworks

DARPA

ASPIRE Lab

LGE

Samsung

Cray

NSF

Intel

DOE

Intel ITSC

Google

Huawei

NVIDIA

Oracle

Aramco

Publisher

Association for Computing Machinery (ACM)

Subject

Applied Mathematics, Software

Cited by 12 articles.

1. Integration of Posit Arithmetic in RISC-V Targeting Low-Power Computations;2024 IEEE 24th International Conference on Nanotechnology (NANO);2024-07-08

2. Useful applications of correctly-rounded operators of the form ab + cd + e;2024 IEEE 31st Symposium on Computer Arithmetic (ARITH);2024-06-10

3. Asynchronous Multi-Level Checkpointing: An Enabler of Reproducibility using Checkpoint History Analytics;Proceedings of the SC '23 Workshops of The International Conference on High Performance Computing, Network, Storage, and Analysis;2023-11-12

4. ddRingAllreduce: a high-precision RingAllreduce algorithm;CCF Transactions on High Performance Computing;2023-07-05

5. Improving accuracy of summation using parallel vectorized Kahan's and Gill‐Møller algorithms;Concurrency and Computation: Practice and Experience;2023-05-10