Affiliation:
1. Massachusetts Institute of Technology, USA
Abstract
This paper shows how to build a sparse tensor algebra compiler that is agnostic to tensor formats (data layouts). We develop an interface that describes formats in terms of their capabilities and properties, and show how to build a modular code generator where new formats can be added as plugins. We then describe six implementations of the interface that compose to form the dense, CSR/CSF, COO, DIA, ELL, and HASH tensor formats and countless variants thereof. With these implementations at hand, our code generator can generate code to compute any tensor algebra expression on any combination of the aforementioned formats.
To demonstrate our technique, we have implemented it in the taco tensor algebra compiler. Our modular code generator design makes it simple to add support for new tensor formats, and the performance of the generated code is competitive with hand-optimized implementations. Furthermore, by extending taco to support a wider range of formats specialized for different application and data characteristics, we can improve end-user application performance. For example, if input data is provided in the COO format, our technique allows computing a single matrix-vector multiplication directly with the data in COO, which is up to 3.6× faster than by first converting the data to CSR.
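To make the workflow described above concrete, the following is a minimal sketch (not taken from the paper) of how a format-specialized sparse matrix-vector multiply is requested through taco's public C++ API (Tensor, Format, Dense, Sparse, IndexVar, compile/assemble/compute). The dimensions and values are illustrative only; per the paper, the COO, DIA, ELL, and HASH layouts are obtained analogously by changing only the Format declaration to a different composition of per-dimension formats.

```cpp
// Minimal sketch of a format-specialized SpMV using taco's C++ API.
#include "taco.h"
using namespace taco;

int main() {
  // CSR = dense outer (row) level composed with a compressed inner (column) level.
  // Swapping this Format declaration for another composition of level formats
  // (e.g., a COO-style composition) is what causes the code generator to emit
  // a kernel specialized to that layout.
  Format csr({Dense, Sparse});
  Format dv({Dense});

  Tensor<double> A({3, 4}, csr);   // 3x4 sparse matrix
  Tensor<double> x({4}, dv);       // dense input vector
  Tensor<double> y({3}, dv);       // dense output vector

  // Insert a few illustrative nonzeros, then pack into the chosen formats.
  A.insert({0, 1}, 2.0);
  A.insert({2, 3}, 4.0);
  x.insert({1}, 3.0);
  x.insert({3}, 5.0);
  A.pack();
  x.pack();

  // Express y(i) = A(i,j) * x(j) in index notation.
  IndexVar i, j;
  y(i) = A(i, j) * x(j);

  y.compile();   // generate and compile a kernel for these formats
  y.assemble();  // allocate the output's structure (a no-op for a dense output)
  y.compute();   // run the generated kernel

  return 0;
}
```

Because the expression and the formats are specified independently, the same index-notation statement can be recompiled against, for example, COO inputs without hand-writing a new kernel, which is the property behind the abstract's COO-versus-CSR comparison.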
Funder
U.S. Department of Energy
Toyota Research Institute
National Science Foundation
Defense Advanced Research Projects Agency
Application Driving Architectures Research Center
Publisher
Association for Computing Machinery (ACM)
Subject
Safety, Risk, Reliability and Quality; Software
Cited by 80 articles.
1. FreeStencil: A Fine-Grained Solver Compiler with Graph and Kernel Optimizations on Structured Meshes for Modern GPUs;Proceedings of the 53rd International Conference on Parallel Processing;2024-08-12
2. SpEQ: Translation of Sparse Codes using Equivalences;Proceedings of the ACM on Programming Languages;2024-06-20
3. Compilation of Modular and General Sparse Workspaces;Proceedings of the ACM on Programming Languages;2024-06-20
4. Mechanised Hypersafety Proofs about Structured Data;Proceedings of the ACM on Programming Languages;2024-06-20
5. eCC++: A Compiler Construction Framework for Embedded Domain-Specific Languages;2024 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW);2024-05-27