Author:
Bashir Shariq,Lai Daphne Teck Ching
Abstract
Approximate frequent itemsets (AFI) mining from noisy databases are computationally more expensive than traditional frequent itemset mining. This is because the AFI mining algorithms generate large number of candidate itemsets. This article proposes an algorithm to mine AFIs using pattern growth approach. The major contribution of the proposed approach is it mines core patterns and examines approximate conditions of candidate AFIs directly with single phase and two full scans of database. Related algorithms apply Apriori-based candidate generation and test approach and require multiple phases to obtain complete AFIs. First phase generates core patterns, and second phase examines approximate conditions of core patterns. Specifically, the article proposes novel techniques that how to map transactions on approximate FP-tree, and how to mine AFIs from the conditional patterns of approximate FP-tree. The approximate FP-tree maps transactions on shared branches when the transactions share a similar set of items. This reduces the size of databases and helps to efficiently compute the approximate conditions of candidate itemsets. We compare the performance of our algorithm with the state of the art AFI mining algorithms on benchmark databases. The experiments are analyzed by comparing the processing time of algorithms and scalability of algorithms on varying database size and transaction length. The results show pattern growth approach mines AFIs in less processing time than related Apriori-based algorithms.
Publisher
Kaunas University of Technology (KTU)
Subject
Electrical and Electronic Engineering,Computer Science Applications,Control and Systems Engineering
Cited by
4 articles.
订阅此论文施引文献
订阅此论文施引文献,注册后可以免费订阅5篇论文的施引文献,订阅后可以查看论文全部施引文献