Affiliation:
1. Centre de Recherche en Informatique de Lens (CRIL), Université d’Artois & CNRS, Rue Jean Souvraz SP-18, F-62307 Lens Cedex, France
Abstract
Several propositional satisfiability (SAT) based encodings have been proposed to deal with various data mining problems, including itemset and sequence mining. This line of research makes it possible to model data mining problems declaratively while exploiting efficient SAT solving techniques. In this paper, we give an overview of our contributions on the application of SAT to modeling and solving itemset mining tasks. We first present a SAT-based encoding of the frequent closed itemset mining task as a propositional formula whose models correspond to the patterns to be mined. Secondly, we show that some data mining constraints can be avoided by reformulation. We illustrate this by reformulating the closedness constraint using the notion of minimal models. Finally, we address the scalability issue, one of the most important challenges of this declarative framework. To this end, we propose a complete partition-based approach whose aim is to avoid encoding the whole database as a single formula. Using a partition of the set of items, our approach leads to several propositional formulas of reasonable size. The experimental evaluation on several known datasets shows large improvements over the direct approach without partitioning, while significantly reducing the performance gap with respect to specialized algorithms.
Publisher
World Scientific Pub Co Pte Lt
Subject
Artificial Intelligence
Cited by
6 articles.