Abstract
AbstractIn this paper, we present an efficient novel method for mining discriminative itemsets over data streams using the sliding window model. Discriminative itemsets are the itemsets that are frequent in the target data stream, and their frequency in the target stream is much higher in comparison to their frequency in the rest of the streams. The problem of mining discriminative itemsets has more challenges than mining frequent itemsets, especially in the sliding window model, as during the window frame sliding, the algorithms have to deal with the combinatorial explosion of itemsets in more than one data stream, for the transactions coming in and going out of the sliding window. We propose a single scan algorithm using two novel in-memory data structures for mining discriminative itemsets in a combination of offline and online sliding windows. Offline processing is used for controlling the generation of many unpromising itemsets. Online processing is used for getting more up-to-date and accurate online answers between two offline slidings. The discovered discriminative itemsets are accurately updated in the offline sliding window periodically, and the mining process is continued in the online sliding between two periodic offline slidings. The extensive empirical analysis shows that the proposed algorithm provides efficient time and space complexities with full accuracy. The algorithm can handle large, fast-speed, and complex data streams.
Funder
Queensland University of Technology
Publisher
Springer Science and Business Media LLC
Subject
Computer Science Applications,Computer Networks and Communications,Computer Graphics and Computer-Aided Design,Computational Theory and Mathematics,Artificial Intelligence,General Computer Science