Affiliation:
1. College of Computer Science and Electronic Engineering, Hunan University, Changsha, China
Abstract
The amount of scientific data is growing at an unprecedented pace. Tensors are a common form of such data, exhibiting high-order, high-dimensional, and sparse features. While tensor-based analysis methods are effective, the sheer increase in data size has made processing the original tensor infeasible. Tensor decomposition offers a solution by decomposing the tensor into multiple low-rank matrices or tensors that tensor-based analysis methods can use efficiently. One such algorithm is the Tucker decomposition, which decomposes an N-order tensor into N low-rank factor matrices and a low-rank core tensor. However, many Tucker decomposition techniques generate large intermediate variables and require significant computational resources, rendering them inadequate for processing high-order and high-dimensional tensors. This article introduces FasterTucker decomposition, a novel approach to tensor decomposition that builds on FastTucker decomposition, a variant of the Tucker decomposition. We propose an efficient parallel FasterTucker decomposition algorithm, called cuFasterTucker, designed to run on a GPU platform. Our algorithm has low storage and computational requirements and provides an effective solution for high-order and high-dimensional sparse tensor decomposition. Compared to state-of-the-art algorithms, our approach achieves a speedup of approximately 7 to 23 times.
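To illustrate the Tucker structure the abstract refers to (this is only a minimal NumPy sketch of the decomposition's form, not the paper's cuFasterTucker algorithm; all shapes and rank values below are illustrative assumptions), an N-order tensor X is approximated by a small core tensor G multiplied along each mode by a low-rank factor matrix:

```python
import numpy as np

def mode_n_product(tensor, matrix, mode):
    """Multiply `tensor` by `matrix` along axis `mode`.

    Contracts the tensor's `mode` axis (size R) with the matrix's
    second axis (R), then moves the new axis (size I) back into place.
    """
    return np.moveaxis(np.tensordot(tensor, matrix, axes=(mode, 1)), -1, mode)

rng = np.random.default_rng(0)
I, J, K = 6, 5, 4      # dimensions of the full 3-order tensor (assumed)
R1, R2, R3 = 2, 3, 2   # Tucker ranks, much smaller than I, J, K (assumed)

G = rng.standard_normal((R1, R2, R3))  # low-rank core tensor
A = rng.standard_normal((I, R1))       # factor matrix for mode 0
B = rng.standard_normal((J, R2))       # factor matrix for mode 1
C = rng.standard_normal((K, R3))       # factor matrix for mode 2

# Reconstruct the full tensor: X = G x1 A x2 B x3 C.
X = mode_n_product(mode_n_product(mode_n_product(G, A, 0), B, 1), C, 2)
print(X.shape)  # the full (I, J, K) tensor, stored via far fewer parameters
```

The storage saving is the point: the factors hold I*R1 + J*R2 + K*R3 + R1*R2*R3 values instead of I*J*K, which is what makes Tucker-style decompositions attractive for large sparse tensors.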
Funder
National Key R&D Program of China
Key Program of National Natural Science Foundation of China
National Natural Science Foundation of China
Publisher
Association for Computing Machinery (ACM)