On the Use of BLAS Libraries in Modern Scientific Codes at Scale

Published: 2020 Page: 67-79
ISSN: 1865-0929
Container-title: Communications in Computer and Information Science

Author:

Waugh Harry, McIntosh-Smith Simon

Publisher

Springer International Publishing

Link

https://link.springer.com/content/pdf/10.1007/978-3-030-63393-6_5

References: 13 articles.

1. NVIDIA A100 Tensor Core GPU Architecture: Unprecedented Acceleration At Every Scale (2020). https://www.nvidia.com/content/dam/en-zz/Solutions/Data-Center/nvidia-ampere-architecture-whitepaper.pdf

2. The x86 Advanced Matrix Extension (AMX) Brings Matrix Operations; To Debut with Sapphire Rapids (2020). https://fuse.wikichip.org/news/3600/the-x86-advanced-matrix-extension-amx-brings-matrix-operations-to-debut-with-sapphire-rapids/

3. Dongarra, J., Hammarling, S., Higham, N., Relton, S., Valero-Lara, P., Zounon, M.: The design and performance of batched BLAS on modern high-performance computing systems. Proc. Comput. Sci. 108, 495–504 (2017). https://doi.org/10.1016/j.procs.2017.05.138

4. Goodfellow, I., Bengio, Y., Courville, A.: Deep Learning. MIT Press (2016). http://www.deeplearningbook.org

5. Gustafson, J.L.: Amdahl’s Law. In: Padua, D. (ed.) Encyclopedia of Parallel Computing, vol. xx, pp. 53–60. Springer US, Boston, MA (2011). https://doi.org/10.1007/978-0-387-09766-4_77

Cited by 1 article.

1. Fast matrix multiplication via compiler‐only layered data reorganization and intrinsic lowering; Software: Practice and Experience; 2023-05-14