Scalable parallel data mining for association rules

Published:1997-06 Issue:2 Volume:26 Page:277-288
ISSN:0163-5808
Container-title:ACM SIGMOD Record
language:en
Short-container-title:SIGMOD Rec.

Author:

Han Eui-Hong¹, Karypis George¹, Kumar Vipin¹

Affiliation:

1. Department of Computer Science, University of Minnesota, Minneapolis, MN

Abstract

One of the important problems in data mining is discovering association rules from databases of transactions, where each transaction consists of a set of items. The most time-consuming operation in this discovery process is computing how frequently interesting subsets of items (called candidates) occur in the database of transactions. To prune the exponentially large space of candidates, most existing algorithms consider only those candidates that have a user-defined minimum support. Even with this pruning, the task of finding all association rules requires substantial computational power and time. Parallel computers offer a potential solution to the computational requirements of this task, provided efficient and scalable parallel algorithms can be designed. In this paper, we present two new parallel algorithms for mining association rules. The Intelligent Data Distribution algorithm efficiently uses the aggregate memory of the parallel computer by employing an intelligent candidate partitioning scheme, and uses an efficient communication mechanism to move data among the processors. The Hybrid Distribution algorithm further improves upon the Intelligent Data Distribution algorithm by dynamically partitioning the candidate set to maintain good load balance. The experimental results on a Cray T3D parallel computer show that the Hybrid Distribution algorithm scales linearly, exploits the aggregate memory better, and can generate more association rules with a single scan of the database per pass.
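
The support-counting step described in the abstract, and the idea of splitting the candidate set across processors, can be illustrated with a minimal Python sketch. This is not the paper's algorithm: the function names, the toy transactions, and the round-robin partitioning are all assumptions made for illustration, and the "processors" are simulated sequentially in one process.

```python
from itertools import combinations


def count_support(transactions, candidates):
    """Count how many transactions contain each candidate itemset."""
    counts = {c: 0 for c in candidates}
    for t in transactions:
        for c in candidates:
            if c <= t:  # candidate is a subset of the transaction
                counts[c] += 1
    return counts


def prune_by_support(counts, min_support):
    """Keep only candidates whose count reaches the minimum support."""
    return {c: n for c, n in counts.items() if n >= min_support}


def partition_round_robin(candidates, num_parts):
    """Hypothetical partitioning: deal candidates out in sorted order.

    This is a stand-in for illustration, not the candidate partitioning
    scheme used by Intelligent Data Distribution.
    """
    ordered = sorted(candidates, key=lambda c: tuple(sorted(c)))
    return [ordered[i::num_parts] for i in range(num_parts)]


# Toy transaction database (assumed data, not from the paper).
transactions = [
    {"bread", "milk"},
    {"bread", "diaper", "beer"},
    {"milk", "diaper", "beer"},
    {"bread", "milk", "diaper"},
    {"bread", "milk", "beer"},
]

# Candidates: every 2-item subset of the items seen in the database.
items = sorted(set().union(*transactions))
candidates = [frozenset(pair) for pair in combinations(items, 2)]

# Serial counting followed by minimum-support pruning.
counts = count_support(transactions, candidates)
frequent = prune_by_support(counts, min_support=3)

# Simulated parallel counting: each part holds only its share of the
# candidates, scans all transactions, and the results are merged.
parts = partition_round_robin(candidates, num_parts=2)
merged = {}
for part in parts:
    merged.update(count_support(transactions, part))
```

Because each simulated processor stores only its own share of the candidates, the memory needed per processor shrinks as processors are added, which is the sense in which partitioning the candidate set uses aggregate memory. The merged counts match the serial counts, since every candidate is counted against every transaction exactly once.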

Publisher

Association for Computing Machinery (ACM)

Subject

Information Systems, Software

Link

https://dl.acm.org/doi/pdf/10.1145/253262.253330

References: 13 articles.

1. Mining association rules between sets of items in large databases

2. Parallel mining of association rules

Cited by 31 articles.

1. Incremental high average-utility itemset mining: survey and challenges;Scientific Reports;2024-04-30

2. Degradation of edible mushroom waste by Hermetia illucens L. and consequent adaptation of its gut microbiota;Scientific Reports;2024-04-30

3. Mine-first association rule mining: An integration of independent frequent patterns in distributed environments;Decision Analytics Journal;2024-03

4. A Comparative Study on Association Rule Mining in Distributed Data Mining;2024 IEEE International Conference on Computing, Power and Communication Technologies (IC2PCT);2024-02-09

5. High Frequency Rule Synthesis in a Large Scale Multiple Database with MapReduce;International Journal of Electronics and Telecommunications;2023-07-26