A Hybrid Algorithm of Mining Closed Itemsets for Large Databases

Published: 2011-12 Volume: 145 Page: 292-296
ISSN: 1662-7482
Container-title: Applied Mechanics and Materials
Short-container-title: AMM

Author:

Huang Lee Wen¹

Affiliation:

1. Far East University

Abstract

Data mining is the nontrivial extraction of implicit, previously unknown, and potentially useful information from data in databases. Mining closed large itemsets extends association rule mining: it aims to find the necessary subset of large itemsets that can represent all large itemsets. In this paper, we design a hybrid approach that takes the characteristics of the data into account to mine closed large itemsets efficiently. Two features of market basket analysis are considered: the number of items is large, and the number of associated items for each item is small. By combining the cut-point method with the hash concept, the new algorithm finds the closed large itemsets efficiently. Simulation results show that the new algorithm outperforms the FP-CLOSE algorithm in both execution time and storage space.
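To make the notion of a closed large itemset concrete, the following is a minimal brute-force sketch. It is not the hybrid algorithm of the paper (the abstract does not specify the cut-point or hash steps) and it is not FP-CLOSE. It assumes that "large" means "support at least `min_support`" and that an itemset is closed when no proper superset has the same support. The function names and the toy transactions are hypothetical and chosen only for illustration.

```python
from itertools import combinations


def frequent_itemsets(transactions, min_support):
    """Brute force: count every non-empty subset of every transaction."""
    counts = {}
    for t in transactions:
        items = sorted(set(t))
        for k in range(1, len(items) + 1):
            for combo in combinations(items, k):
                key = frozenset(combo)
                counts[key] = counts.get(key, 0) + 1
    # Keep only the "large" (frequent) itemsets.
    return {s: c for s, c in counts.items() if c >= min_support}


def closed_itemsets(frequent):
    """Keep itemsets that have no proper superset with equal support."""
    closed = {}
    for s, c in frequent.items():
        if not any(s < other and c == oc for other, oc in frequent.items()):
            closed[s] = c
    return closed


def support_from_closed(itemset, closed):
    """Recover the support of a large itemset from the closed ones alone."""
    return max((c for s, c in closed.items() if itemset <= s), default=0)


# Hypothetical market-basket data, for illustration only.
transactions = [
    {"bread", "milk"},
    {"bread", "milk", "eggs"},
    {"bread", "milk", "eggs"},
    {"bread"},
]
frequent = frequent_itemsets(transactions, min_support=2)
closed = closed_itemsets(frequent)
```

In this toy data, seven itemsets are large but only three are closed, and `support_from_closed` recovers the support of every large itemset from those three. This is the sense in which the closed itemsets are representative of all large itemsets. The brute-force enumeration is exponential in transaction length and is only meant to illustrate the definition.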

Publisher

Trans Tech Publications, Ltd.

Link

https://www.scientific.net/AMM.145.292.pdf

References: 11 articles.

1. R. Agrawal and R. Srikant, "Fast Algorithms for Mining Association Rules," Proc. of the 20th Int. Conf. on Very Large Data Bases, pp. 487-499, (1994).

2. R. Agrawal and R. Srikant, "Mining Sequential Patterns," Proc. of the 11th Int. Conf. on Data Engineering, pp. 3-14, (1995).

3. J. Han, H. Cheng, D. Xin, and X. Yan, "Frequent Pattern Mining: Current Status and Future Directions," Data Mining and Knowledge Discovery, Vol. 15, No. 1, pp. 55-86, Aug. (2007).

4. L. W. Huang and Y. I. Chang, "An Efficient Graph-Based Approach to Mining Association Rules for Large Databases," International Journal of Intelligent Information and Database Systems, Vol. 3, No. 3, pp. 259-274, (2009).

5. C. Kamath, "The Role of Parallel and Distributed Processing in Data Mining," Tech. Rep. UCRL-JC-142468, Newsletter of the IEEE Technical Committee on Distributed Processing, (2001).