Mining frequent itemsets from streaming transaction data using genetic algorithms

Published:2020-07-25 Issue:1 Volume:7
ISSN:2196-1115
Container-title:Journal of Big Data
language:en
Short-container-title:J Big Data

Author:

Bagui Sikha, Stanley Patrick

Abstract

This paper presents a study of mining frequent itemsets from streaming data in the presence of concept drift. Streaming data, being volatile in nature, is particularly challenging to mine. An approach using genetic algorithms is presented, and various relationships between concept drift, sliding window size, and genetic algorithm constraints are explored. Concept drift is identified by changes in frequent itemsets. The novelty of this work lies in determining concept drift using frequent itemsets for mining streaming data, using the genetic algorithm framework. Formulas have been presented for calculating minimum support counts in streaming data using sliding windows. Testing highlighted that the ratio of the window size to transactions per drift was a key to good performance. Getting good results when the sliding window size was too small was a challenge since normal fluctuations in the data could appear to be a concept drift. Window size must be managed in conjunction with support and confidence values in order to achieve reasonable results. This method of detecting concept drift performed well when larger window sizes were used.
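
The idea of flagging concept drift from changes in the frequent itemsets of a sliding window can be illustrated with a minimal sketch. This is not the paper's method: it enumerates itemsets by brute force instead of using a genetic algorithm, and the function names, the rule "minimum support count = ceil(min_support * window length)", and the Jaccard similarity threshold are all assumptions made for illustration, since the abstract does not state the paper's formulas.

```python
from collections import deque
from itertools import combinations
from math import ceil


def frequent_itemsets(window, min_support, max_size=2):
    """Return itemsets (as sorted tuples) that meet the minimum support count."""
    # Assumed rule, not taken from the paper:
    # minimum support count = ceil(min_support * number of transactions in window).
    min_count = ceil(min_support * len(window))
    counts = {}
    for transaction in window:
        items = sorted(set(transaction))
        # Brute-force enumeration of itemsets up to max_size items
        # (the paper searches with a genetic algorithm instead).
        for size in range(1, max_size + 1):
            for itemset in combinations(items, size):
                counts[itemset] = counts.get(itemset, 0) + 1
    return {itemset for itemset, count in counts.items() if count >= min_count}


def detect_drift(stream, window_size, min_support, threshold=0.5):
    """Return stream indices at which the frequent itemsets changed markedly."""
    window = deque(maxlen=window_size)  # oldest transaction drops out automatically
    reference = None
    drift_points = []
    for index, transaction in enumerate(stream):
        window.append(transaction)
        if len(window) < window_size:
            continue  # wait until the first full window is available
        current = frequent_itemsets(window, min_support)
        if reference is None:
            reference = current
            continue
        union = reference | current
        # Jaccard similarity between the reference and current frequent itemsets.
        similarity = len(reference & current) / len(union) if union else 1.0
        if similarity < threshold:
            drift_points.append(index)
            reference = current  # adopt the new concept as the reference
    return drift_points
```

Calling `detect_drift(stream, window_size=10, min_support=0.5)` on a list of transactions returns the positions where a drift was flagged. With a very small `window_size`, ordinary fluctuations in item counts can change the frequent itemsets enough to cross the threshold, which is the difficulty with small windows that the abstract describes.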

Publisher

Springer Science and Business Media LLC

Subject

Information Systems and Management,Computer Networks and Communications,Hardware and Architecture,Information Systems

Link

https://link.springer.com/content/pdf/10.1186/s40537-020-00330-9.pdf

References: 22 articles.

1. Agrawal R, Imielinski T, Swami A. Mining association rules between sets of items in large databases. In: Proceedings of the ACM SIGMOD international conference on management of data, Washington, D.C., USA. 1993. pp. 207–216.

2. Aldallal AS. Avoiding premature convergence of GA in informational retrieval systems. Int J Intell Syst Appl Eng. 2015;2(4):80. https://doi.org/10.18201/ijisae.78975.

3. Angelova M, Pencheva T. Tuning GA parameters to improve convergence time. Int J Chem Eng. 2011. https://doi.org/10.1155/2011/646917.

4. Bull AD. Convergence rates of efficient global optimization algorithms. J Mach Learn Res. 2011;12(88):2879–904.

5. Forrest S, Mitchell M. What makes a problem hard for a GA? Some anomalous results and their explanation. GAs Mach Learn. 1993. https://doi.org/10.1007/978-1-4615-2740-4_6.

Cited by 9 articles.

1. High Utility Itemset Extraction using PSO with Online Control Parameter Calibration;International Journal of Next-Generation Computing;2024-05-14

2. Probabilistic Support Prediction: Fast Frequent Itemset Mining in Dense Data;IEEE Access;2024

3. A High Utility Itemset Mining Algorithm Based on Particle Filter;Mathematical Problems in Engineering;2023-02-23

4. An Effective Reference-Point-Set (RPS) Based Bi-Directional Frequent Itemset Generation;The International Arab Journal of Information Technology;2023

5. Data Reduction Based on Adaptive Stream Window Size for IoT Data;2022 International Conference for Natural and Applied Sciences (ICNAS);2022-05-14