Computational Complexity and ILP Models for Pattern Problems in the Logical Analysis of Data

Published: 2021-08-09 Volume: 14 Issue: 8 Page: 235
ISSN: 1999-4893
Container-title: Algorithms
Language: en

Authors:

Lancia Giuseppe, Serafini Paolo

Abstract

Logical Analysis of Data is a procedure aimed at identifying relevant features in data sets with both positive and negative samples. The goal is to build Boolean formulas, represented by strings over {0,1,-} called patterns, which can be used to classify new samples as positive or negative. Since a data set can be explained in alternative ways, many computational problems arise related to the choice of a particular set of patterns. In this paper we study the computational complexity of several of these pattern problems (showing that they are, in general, computationally hard) and we propose some integer programming models that appear to be effective. We describe an ILP model for finding the minimum-size set of patterns explaining a given set of samples and another one for the problem of determining whether two sets of patterns are equivalent, i.e., they explain exactly the same samples. We base our first model on a polynomial procedure that computes all patterns compatible with a given set of samples. Computational experiments substantiate the effectiveness of our models on fairly large instances. Finally, we conjecture that the existence of an effective ILP model for finding a minimum-size set of patterns equivalent to a given set of patterns is unlikely, due to the problem being NP-hard and co-NP-hard at the same time.
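
As an illustration of the pattern notion in the abstract, the following is a minimal Python sketch. It assumes the usual reading in which `-` is a don't-care position and a pattern covers a binary sample when every fixed position agrees; the abstract does not spell this out. The function names `covers`, `explained`, and `equivalent` are hypothetical, and the equivalence check is a brute-force enumeration that is exponential in the sample length. It is not the ILP model proposed in the paper.

```python
from itertools import product


def covers(pattern: str, sample: str) -> bool:
    """Return True if the pattern agrees with the sample on every fixed position.

    Assumption: '-' in the pattern means "don't care".
    """
    return len(pattern) == len(sample) and all(
        p == "-" or p == s for p, s in zip(pattern, sample)
    )


def explained(patterns: list[str], n: int) -> set[str]:
    """All binary strings of length n covered by at least one pattern (brute force)."""
    result = set()
    for bits in product("01", repeat=n):
        sample = "".join(bits)
        if any(covers(p, sample) for p in patterns):
            result.add(sample)
    return result


def equivalent(a: list[str], b: list[str], n: int) -> bool:
    """Two pattern sets are equivalent if they explain exactly the same samples."""
    return explained(a, n) == explained(b, n)
```

For example, under these assumptions `covers("1-0", "110")` holds while `covers("1-0", "111")` does not, and the sets `["1-", "0-"]` and `["--"]` are equivalent on samples of length 2. Enumerating all 2^n samples is only feasible for small n, which is consistent with the hardness results the abstract reports for these problems.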

Publisher

MDPI AG

Subject

Computational Mathematics, Computational Theory and Mathematics, Numerical Analysis, Theoretical Computer Science

Link

https://www.mdpi.com/1999-4893/14/8/235/pdf

References: 18 articles.

1. Data Mining: Concepts and Techniques; Jaiwei, 2011

2. Data Mining: Concepts, Models, Methods, and Algorithms; Kantardzic, 2003

3. Feature Selection for Classification

4. Feature Selection for Data Mining; Felici, 2006

5. Advances in Feature Selection for Data and Pattern Recognition: An Introduction; in Advances in Feature Selection for Data and Pattern Recognition; Stanczyk, 2018

Cited by 6 articles.

1. Supervised Classification Problem: Searching for Maximum Patterns; 2024 X International Conference on Information Technology and Nanotechnology (ITNT); 2024-05-20

2. Greedy algorithm for finding Pareto optimal patterns; AIP Conference Proceedings; 2024

3. An Efficient Algorithm for K-Diagnosability Analysis of Bounded and Unbounded Petri Nets; IFAC-PapersOnLine; 2024

4. Paired Patterns in Logical Analysis of Data for Decision Support in Recognition; Computation; 2022-10-12

5. Efficient Process Scheduling for Multi-core Systems; 2022 IEEE 8th Intl Conference on Big Data Security on Cloud (BigDataSecurity), IEEE Intl Conference on High Performance and Smart Computing (HPSC) and IEEE Intl Conference on Intelligent Data and Security (IDS); 2022-05