Custom High-Performance Vector Code Generation for Data-Specific Sparse Computations

Published: 2022-10-08
Container-title: Proceedings of the International Conference on Parallel Architectures and Compilation Techniques

Author:

Horro Marcos¹, Pouchet Louis-Noël², Rodríguez Gabriel¹, Touriño Juan¹

Affiliation:

1. Universidade da Coruña, A Coruña, Spain

2. Colorado State University

Funder

National Science Foundation

Xunta de Galicia

Ministry of Science and Innovation, Spain

Ministry of Education, Spain

Publisher

ACM

Link

https://dl.acm.org/doi/pdf/10.1145/3559009.3569668

References (46 articles)

1. A. Abel and J. Reineke. 2019. uops.info: Characterizing Latency, Throughput, and Port Usage of Instructions on Intel Microarchitectures. In Intl. Conference on Architectural Support for Programming Languages and Operating Systems, ASPLOS. Providence, RI, USA, 673--686.

2. A. Abel. 2022. uiCA: Accurate Throughput Prediction of Basic Blocks on Recent Intel Microarchitectures. In Proceedings of the 36th ACM International Conference on Supercomputing, ICS. Virtual Event, USA.

3. A. Ashari, N. Sedaghati, J. Eisenlohr, S. Parthasarathy, and P. Sadayappan. 2014. Fast Sparse Matrix-vector Multiplication on GPUs for Graph Applications. In Intl. Conference for High Performance Computing, Networking, Storage and Analysis, SC. New Orleans, LA, USA, 781--792.

4. T. Augustine, J. Sarma, L.-N. Pouchet, and G. Rodríguez. 2019. Generating piecewise-regular code from irregular structures. In Proceedings of the 40th ACM SIGPLAN Conference on Programming Language Design and Implementation, PLDI. 625--639.

5. N. Bell and M. Garland. 2008. Efficient Sparse Matrix-Vector Multiplication on CUDA. NVIDIA Technical Report NVR-2008-004. NVIDIA Corporation.

Cited by 1 article.

1. Register Tiling for Unstructured Sparsity in Neural Network Inference. Proceedings of the ACM on Programming Languages. 2023-06-06.