Affiliation:
1. Data Science Discipline, Science and Engineering faculty, Queensland University of Technology, Brisbane, Queensland 4000, Australia
Abstract
We tackle the problem of discriminative itemset mining. Given a set of datasets, we want to find the itemsets that are frequent in the target dataset and have much higher frequencies compared with the same itemsets in other datasets. Such itemsets are very useful for dataset discrimination. We demonstrate that this problem has important applications and, at a same time, is very challenging. We present the DISSparse algorithm, a mining method that uses two determinative heuristics based on the sparsity characteristics of the discriminative itemsets as a small subset of the frequent itemsets. We prove that the DISSparse algorithm is sound and complete. We experimentally investigate the performance of the proposed DISSparse on a range of datasets, evaluating its efficiency and stability and demonstrating it is substantially faster than the baseline method.
Publisher
World Scientific Pub Co Pte Ltd
Subject
Library and Information Sciences,Computer Networks and Communications,Computer Science Applications
Cited by
4 articles.
订阅此论文施引文献
订阅此论文施引文献,注册后可以免费订阅5篇论文的施引文献,订阅后可以查看论文全部施引文献