Abstract
Data mining is the nontrivial extraction of implicit, previously unknown, and potentially useful information from data in databases. Mining closed large itemsets extends association rule mining: it aims to find the subset of large itemsets that is sufficient to represent all large itemsets. In this paper, we design a hybrid approach that takes the characteristics of the data into account to mine closed large itemsets efficiently. Two features of market basket data are considered: the number of items is large, and the number of items associated with each item is small. By combining the cut-point method with the hash concept, the new algorithm finds the closed large itemsets efficiently. Simulation results show that the new algorithm outperforms the FP-CLOSE algorithm in both execution time and storage space.
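To make the notion of a closed large itemset concrete, the following is a minimal brute-force sketch and not the hybrid cut-point and hash algorithm described in the abstract. It assumes the usual definitions: an itemset is large when its support count reaches a minimum threshold, and a large itemset is closed when no proper superset has the same support. The function name, the toy transactions, and the threshold are all hypothetical and chosen only for illustration.

```python
from itertools import combinations


def closed_large_itemsets(transactions, min_support):
    """Brute-force illustration: return {itemset: support} for closed large itemsets.

    This enumerates candidates level by level and is only suitable for tiny
    inputs; it is NOT the paper's algorithm.
    """
    items = sorted(set().union(*transactions))
    support = {}

    # Step 1: collect every large itemset together with its support count.
    for size in range(1, len(items) + 1):
        found_any = False
        for candidate in combinations(items, size):
            itemset = frozenset(candidate)
            count = sum(1 for t in transactions if itemset <= t)
            if count >= min_support:
                support[itemset] = count
                found_any = True
        if not found_any:
            # No large itemset of this size, so no larger one can exist.
            break

    # Step 2: keep only itemsets with no proper superset of equal support.
    closed = {}
    for itemset, count in support.items():
        absorbed = any(
            itemset < other and count == support[other] for other in support
        )
        if not absorbed:
            closed[itemset] = count
    return closed


# Hypothetical market-basket data for illustration only.
transactions = [
    frozenset(t)
    for t in (
        {"bread", "milk"},
        {"bread", "milk", "eggs"},
        {"bread", "milk", "eggs"},
        {"tea"},
    )
]
result = closed_large_itemsets(transactions, min_support=2)
for itemset, count in sorted(result.items(), key=lambda kv: -kv[1]):
    print(sorted(itemset), count)
```

On this toy input there are seven large itemsets but only two closed ones, {bread, milk} with support 3 and {bread, eggs, milk} with support 2. The support of every other large itemset can be recovered from these two, which is the sense in which the closed sets represent all large itemsets.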
Publisher
Trans Tech Publications, Ltd.