Abstract
In this paper, we propose an efficient and highly accurate method for data stream classification, called discriminative associative classification. We define class discriminative association rules (CDARs) as the class association rules (CARs) in one data stream that have higher support than the same rules in the remaining data streams. Compared with associative classification mining in a single data stream, discriminative associative classification mining over multiple data streams poses additional challenges, because the Apriori subset property does not apply. The proposed single-pass H-DAC algorithm is designed around the distinguishing features of the rules to improve classification accuracy and efficiency. Continuously arriving transactions, which come at high speed and in large volume, are inserted as they arrive, and CDARs are discovered in the tilted-time window model. The data structures are dynamically adjusted in offline time intervals to reflect each rule's support in different periods. Empirical analysis shows the effectiveness of the proposed method on large, high-speed data streams. Good efficiency is achieved for batch processing of both small and large datasets, along with 0–2% improvements in classification accuracy from the tilted-time window model at almost zero overhead. These improvements are seen only for the first 32 incoming batches at the scale of our experiments, and we expect better results as the data streams grow.
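The CDAR definition above can be illustrated with a minimal sketch. It is not the H-DAC algorithm: it assumes each stream is a small in-memory list of (itemset, class label) transactions, and it reads "higher support" as a strict comparison of relative supports with no threshold. The names `rule_support` and `is_discriminative`, and the toy data, are hypothetical.

```python
from typing import FrozenSet, List, Tuple

# A transaction is (items, class label). This representation is an assumption.
Transaction = Tuple[FrozenSet[str], str]


def rule_support(stream: List[Transaction], antecedent: FrozenSet[str], label: str) -> float:
    """Relative support of the class association rule antecedent -> label."""
    if not stream:
        return 0.0
    hits = sum(1 for items, cls in stream if antecedent <= items and cls == label)
    return hits / len(stream)


def is_discriminative(target: List[Transaction], rest: List[Transaction],
                      antecedent: FrozenSet[str], label: str) -> bool:
    """True if the rule has strictly higher support in target than in rest.

    Assumption: a plain strict comparison stands in for whatever
    discriminative measure the paper actually uses.
    """
    return rule_support(target, antecedent, label) > rule_support(rest, antecedent, label)


# Toy data: one target stream and the pooled remaining streams.
target = [
    (frozenset({"a", "b"}), "x"),
    (frozenset({"a", "b"}), "x"),
    (frozenset({"c"}), "y"),
    (frozenset({"c"}), "y"),
]
rest = [
    (frozenset({"a"}), "x"),
    (frozenset({"a"}), "x"),
    (frozenset({"a"}), "x"),
    (frozenset({"a", "b"}), "x"),
]

# {a} -> x: support 0.5 in target, 1.0 in rest.
print(is_discriminative(target, rest, frozenset({"a"}), "x"))       # False
# {a, b} -> x: support 0.5 in target, 0.25 in rest.
print(is_discriminative(target, rest, frozenset({"a", "b"}), "x"))  # True
```

In this toy data the rule {a, b} → x is discriminative while its sub-rule {a} → x is not, so a rule cannot be pruned just because one of its subsets failed the test. This is one way to see why Apriori-style subset pruning does not carry over to the discriminative setting.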
Funder
Queensland University of Technology
Publisher
Springer Science and Business Media LLC
Subject
Geometry and Topology, Theoretical Computer Science, Software
Cited by
1 article.