Affiliation:
1. Carnegie Mellon University, Pittsburgh, PA, USA
2. University of Minnesota, Minneapolis, MN, USA
Abstract
How can we efficiently decompose a tensor into sparse factors, when the data do not fit in memory? Tensor decompositions have gained a steadily increasing popularity in data-mining applications; however, the current state-of-the-art decomposition algorithms operate on main memory and do not scale to truly large datasets. In this work, we propose ParCube, a new and highly parallelizable method for speeding up tensor decompositions that is well suited to producing sparse approximations. Experiments with even moderately large data indicate over 90% sparser outputs and 14 times faster execution, with approximation error close to the current state of the art, irrespective of computation and memory requirements. We provide theoretical guarantees for the algorithm's correctness and we experimentally validate our claims through extensive experiments on four different real-world datasets (Enron, Lbnl, Facebook, and Nell), demonstrating its effectiveness for data-mining practitioners. In particular, we are the first to analyze the very large Nell dataset using a sparse tensor decomposition, demonstrating that ParCube enables us to handle very large datasets effectively and efficiently. Finally, we make our highly scalable parallel implementation publicly available, enabling reproducibility of our work.
Publisher
Association for Computing Machinery (ACM)
Cited by
23 articles.