Abstract
In this study, we implemented four different versions of Apriori (basic and basic multi-threaded, Bloom filter, trie, and count-min sketch) and proposed a new algorithm, NCLAT (Near Candidate-Less Apriori with Tidlists). We compared the runtimes and maximum memory usage of our implementations with one another and, in some cases, with the runtime of Borgelt's Apriori implementation. The NCLAT implementation is more efficient than the other Apriori implementations we know of in terms of the number of database scans and the number of candidates generated. Unlike the original Apriori algorithm, which scans the database at every level and creates all candidates for each level in advance, NCLAT scans the database only once and creates candidate itemsets only for level one. Thus, the number of candidates created equals the number of unique items in the database.
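The idea of scanning the database once and working from tidlists afterwards can be illustrated with a generic vertical (tidlist-intersection) miner. The sketch below is not the authors' NCLAT implementation, whose details are not given in this abstract. It is a minimal illustration under the assumption that each item's tidlist is the set of ids of the transactions containing it, and that the support of a larger itemset is obtained by intersecting tidlists instead of rescanning the database. The function names `build_tidlists` and `mine_frequent_itemsets` and the absolute `min_support` threshold are hypothetical choices made for this example.

```python
from collections import defaultdict


def build_tidlists(transactions):
    """Scan the database once: map each item to the set of ids of the
    transactions that contain it (its tidlist)."""
    tidlists = defaultdict(set)
    for tid, items in enumerate(transactions):
        for item in items:
            tidlists[item].add(tid)
    return tidlists


def mine_frequent_itemsets(transactions, min_support):
    """Return {itemset (sorted tuple): support count} for every itemset whose
    support is at least min_support (an absolute count, assumed here).

    Only single items are counted against the database. Larger itemsets are
    grown depth-first by intersecting tidlists, so no further scan is needed.
    """
    tidlists = build_tidlists(transactions)
    # Level one: keep only frequent single items, in a fixed order.
    frequent_items = sorted(
        ((item, tids) for item, tids in tidlists.items() if len(tids) >= min_support),
        key=lambda pair: pair[0],
    )
    result = {}

    def extend(prefix, prefix_tids, tail):
        for i, (item, tids) in enumerate(tail):
            # Support of prefix + item is the size of the tidlist intersection.
            new_tids = tids if prefix_tids is None else prefix_tids & tids
            if len(new_tids) >= min_support:
                itemset = prefix + (item,)
                result[itemset] = len(new_tids)
                # Only items later in the order are tried, so each itemset
                # is visited at most once.
                extend(itemset, new_tids, tail[i + 1:])

    extend((), None, frequent_items)
    return result


if __name__ == "__main__":
    db = [{"a", "b", "c"}, {"a", "b"}, {"a", "c"}, {"b", "c"}, {"a", "b", "c"}]
    for itemset, support in sorted(mine_frequent_itemsets(db, 3).items()):
        print(itemset, support)
```

In this sketch, the only objects counted against the raw database are the single items, which mirrors the abstract's statement that the number of candidates equals the number of unique items. The trade-off is that the tidlists must be kept in memory.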