Abstract
Purpose
Frequent itemset mining (FIM) is a basic topic in data mining. Most FIM methods build an itemset database containing all possible itemsets and use predefined thresholds to determine whether an itemset is frequent. However, this approach has some deficiencies. It is better suited to discrete data than to ordinal/continuous data, which may result in computational redundancy, and some of the results are difficult to interpret. The purpose of this paper is to address this gap by proposing a new data mining method.
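As a rough illustration of the threshold-based approach described above, the following minimal sketch enumerates every candidate itemset and keeps those whose support meets a predefined threshold. The function name `frequent_itemsets`, the toy transactions and the threshold value are assumptions made for illustration; they are not taken from the paper, and practical FIM algorithms prune candidates rather than enumerating all of them.

```python
from itertools import combinations


def frequent_itemsets(transactions, min_support):
    """Brute-force FIM sketch (hypothetical helper, not the paper's code).

    Enumerates every non-empty itemset over the observed items and keeps
    those whose support (share of transactions containing the itemset)
    is at least min_support.
    """
    items = sorted(set().union(*transactions))
    n = len(transactions)
    result = {}
    for size in range(1, len(items) + 1):
        for candidate in combinations(items, size):
            candidate_set = set(candidate)
            count = sum(1 for t in transactions if candidate_set <= t)
            support = count / n
            if support >= min_support:
                result[candidate] = support
    return result


# Toy discrete transactions, invented for this example.
transactions = [
    {"a", "b", "c"},
    {"a", "b"},
    {"a", "c"},
    {"b", "c"},
]
frequent = frequent_itemsets(transactions, min_support=0.5)
```

The number of candidates grows exponentially with the number of distinct items, which is one way to picture the computational redundancy mentioned above.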
Design/methodology/approach
A regression pattern (RP) model is introduced, in which a regression model and the FIM method are combined to solve these problems. Using survey data from a computer technology and software professional qualification examination, a multiple linear regression model is selected to mine associations between items.
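The abstract does not spell out the RP algorithm, so the following is only one possible reading, sketched under stated assumptions: each variable is regressed on the remaining variables by ordinary least squares, and a fit is kept as a "pattern" when its R² reaches a threshold, by analogy with the support threshold in FIM. The names `solve`, `fit_linear` and `regression_patterns`, the R² criterion and the toy data are all hypothetical and are not the survey data or the procedure used in the paper.

```python
def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def fit_linear(xs, y):
    """Ordinary least squares via the normal equations.

    xs: rows of predictor values; y: target values.
    Returns (coefficients with the intercept first, R^2).
    """
    rows = [[1.0] + list(r) for r in xs]
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    beta = solve(xtx, xty)
    pred = [sum(b * v for b, v in zip(beta, r)) for r in rows]
    mean_y = sum(y) / len(y)
    ss_res = sum((yi - pi) ** 2 for yi, pi in zip(y, pred))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return beta, 1.0 - ss_res / ss_tot


def regression_patterns(data, min_r2):
    """Regress each column on all others; keep fits with R^2 >= min_r2.

    The R^2 threshold is an assumed analogue of the FIM support threshold.
    """
    patterns = {}
    names = list(data)
    for target in names:
        predictors = [name for name in names if name != target]
        xs = list(zip(*(data[p] for p in predictors)))
        beta, r2 = fit_linear(xs, data[target])
        if r2 >= min_r2:
            patterns[target] = (predictors, beta, r2)
    return patterns


# Invented ordinal/continuous data: score = 1 + 2*hours + 0.5*practice.
data = {
    "hours": [1, 2, 3, 4, 5, 6],
    "practice": [2, 1, 4, 3, 6, 5],
    "score": [4.0, 5.5, 9.0, 10.5, 14.0, 15.5],
}
patterns = regression_patterns(data, min_r2=0.9)
```

In this sketch a retained pattern carries the fitted coefficients as well as the goodness of fit, so it describes the direction and size of an association rather than only the fact that items occur together.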
Findings
The proposed algorithm mines some interesting associations, and the results show that the method can be applied to ordinal/continuous data mining. The experiment with the RP model shows that, compared to FIM, computational redundancy decreases and the results contain more information.
Research limitations/implications
The proposed algorithm is designed for ordinal/continuous data and is expected to provide inspiration for data stream mining and unstructured data mining.
Practical implications
Compared to FIM, which mines associations between discrete items, the RP model can mine associations in ordinal/continuous data sets. Importantly, the RP model performs well in saving computational resources and mining meaningful associations.
Originality/value
The proposed algorithms provide a novel view of how to define and mine associations.
Subject
Library and Information Sciences, Information Systems