Efficient, out-of-memory sparse MTTKRP on massively parallel architectures

Published: 2022-06-28
Container-title: Proceedings of the 36th ACM International Conference on Supercomputing

Author:

Nguyen Andy¹, Helal Ahmed E.², Checconi Fabio², Laukemann Jan³, Tithi Jesmin Jahan², Soh Yongseok¹, Ranadive Teresa⁴, Petrini Fabrizio², Choi Jee W.¹

Affiliation:

1. University of Oregon

2. Intel Labs

3. University of Erlangen-Nürnberg

4. Laboratory for Physical Sciences

Publisher

ACM

Link

https://dl.acm.org/doi/pdf/10.1145/3524059.3532363

References: 55 articles.

1. 2022. Intel DPC++ Compatibility Tool. https://software.intel.com/en-us/get-started-with-intel-dpcpp-compatibility-tool. Online; accessed 14 May 2022.

2. 2022. Nsight Compute Command Line Interface. https://docs.nvidia.com/nsight-compute/pdf/NsightComputeCli.pdf. Online; accessed 14 May 2022.

3. 2022. Nsight Systems User Guide. https://docs.nvidia.com/nsight-systems/pdf/UserGuide.pdf. Online; accessed 14 May 2022.

4. Efficient MATLAB Computations with Sparse and Factored Tensors

5. Implementing sparse matrix-vector multiplication on throughput-oriented processors

Cited by 6 articles.

1. Accelerated Constrained Sparse Tensor Factorization on Massively Parallel Architectures;Proceedings of the 53rd International Conference on Parallel Processing;2024-08-12

2. Distributed-Memory Randomized Algorithms for Sparse Tensor CP Decomposition;Proceedings of the 36th ACM Symposium on Parallelism in Algorithms and Architectures;2024-06-17

3. Sparse MTTKRP Acceleration for Tensor Decomposition on GPU;Proceedings of the 21st ACM International Conference on Computing Frontiers;2024-05-07

4. A Novel Parallel Algorithm for Sparse Tensor Matrix Chain Multiplication via TCU-Acceleration;IEEE Transactions on Parallel and Distributed Systems;2023-08

5. Dynamic Tensor Linearization and Time Slicing for Efficient Factorization of Infinite Data Streams;2023 IEEE International Parallel and Distributed Processing Symposium (IPDPS);2023-05