Complexity of rule sets in mining incomplete data using characteristic sets and generalized maximal consistent blocks

Author:

Clark Patrick G (1), Gao Cheng (1), Grzymala-Busse Jerzy W (2), Mroczek Teresa (3), Niemiec Rafal (3)

Affiliation:

1. Department of Electrical Engineering and Computer Science, University of Kansas, Lawrence, KS 66045, USA

2. Department of Electrical Engineering and Computer Science, University of Kansas, Lawrence, KS 66045, USA and Department of Expert Systems and Artificial Intelligence, University of Information Technology and Management, 35-225 Rzeszow, Poland

3. Department of Expert Systems and Artificial Intelligence, University of Information Technology and Management, 35-225 Rzeszow, Poland

Abstract

In this paper, missing attribute values in incomplete data sets have three possible interpretations: lost values, attribute-concept values and ‘do not care’ conditions. For rule induction, we use two constructs: characteristic sets and generalized maximal consistent blocks. Combining the three interpretations with the two constructs gives six different approaches to data mining. Our previous experiments, in which the main criterion of quality was the error rate evaluated by ten-fold cross-validation, showed that no approach is universally the best. We therefore compare the six approaches by the complexity of the rule sets induced from incomplete data sets. We show that the smallest rule sets are induced from incomplete data sets with attribute-concept values, while the most complicated rule sets are induced from data sets with lost values. The choice between interpretations of missing attribute values is more important than the choice between characteristic sets and generalized maximal consistent blocks.
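The abstract does not define characteristic sets, so the following is only a minimal sketch based on the definitions commonly used in the rough-set literature, not the authors' implementation. It assumes the markers `?` (lost value), `-` (attribute-concept value) and `*` (‘do not care’ condition), and a made-up five-case table; the function names, attribute names and data are all hypothetical. Generalized maximal consistent blocks are not computed here.

```python
from itertools import product

# Assumed markers for the three interpretations of a missing value.
LOST, ATTRIBUTE_CONCEPT, DO_NOT_CARE = "?", "-", "*"
MISSING = {LOST, ATTRIBUTE_CONCEPT, DO_NOT_CARE}


def concept_values(cases, decisions, attribute, i):
    """Specified values of `attribute` among cases sharing case i's decision."""
    return {
        row[attribute]
        for row, d in zip(cases, decisions)
        if d == decisions[i] and row[attribute] not in MISSING
    }


def blocks(cases, decisions, attribute):
    """Map each specified value v to the block [(attribute, v)] of case indices."""
    specified = {row[attribute] for row in cases} - MISSING
    result = {v: set() for v in specified}
    for i, row in enumerate(cases):
        value = row[attribute]
        if value == LOST:
            targets = set()          # a lost value joins no block
        elif value == DO_NOT_CARE:
            targets = specified      # 'do not care' joins every block
        elif value == ATTRIBUTE_CONCEPT:
            targets = concept_values(cases, decisions, attribute, i)
        else:
            targets = {value}
        for v in targets:
            result[v].add(i)
    return result


def characteristic_set(cases, decisions, i):
    """Intersect, over all attributes, the cases compatible with case i."""
    k = set(range(len(cases)))
    for attribute, value in cases[i].items():
        b = blocks(cases, decisions, attribute)
        if value in (LOST, DO_NOT_CARE):
            continue                 # contributes the whole universe
        if value == ATTRIBUTE_CONCEPT:
            values = concept_values(cases, decisions, attribute, i)
            if not values:
                continue             # no specified value in the concept
            k &= set().union(*(b[v] for v in values))
        else:
            k &= b[value]
    return k


# Hypothetical toy table (not data from the paper).
cases = [
    {"temperature": "high", "headache": "yes"},
    {"temperature": "?", "headache": "yes"},
    {"temperature": "high", "headache": "*"},
    {"temperature": "-", "headache": "no"},
    {"temperature": "normal", "headache": "no"},
]
decisions = ["yes", "yes", "no", "no", "no"]

k_sets = [characteristic_set(cases, decisions, i) for i in range(len(cases))]

# Three interpretations times two constructs give the six approaches.
interpretations = ["lost", "attribute-concept", "do not care"]
constructs = ["characteristic sets", "generalized maximal consistent blocks"]
approaches = list(product(interpretations, constructs))
```

In this toy table, the case with a lost temperature places no constraint on that attribute, while the case with an attribute-concept temperature is restricted to the temperature values seen among cases with the same decision.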

Publisher

Oxford University Press (OUP)

Subject

Logic

References: 20 articles.

1. Characteristic sets and generalized maximal consistent blocks in mining incomplete data;Clark,2017

2. Complexity of rule sets in mining incomplete data using characteristic sets and generalized maximal consistent blocks;Clark,2018

3. Experiments on probabilistic approximations;Clark,2011

4. Experiments using three probabilistic approximations for rule induction from incomplete data sets;Clark,2012

5. LERS—a system for learning from examples based on rough sets;Grzymala-Busse,1992

Cited by 4 articles.

1. Maximal consistent block based optimal scale selection for incomplete multi-scale information systems;International Journal of Machine Learning and Cybernetics;2023-01-09

2. Structures Derived from Possible Tables in an Incomplete Information Table;2022 Joint 12th International Conference on Soft Computing and Intelligent Systems and 23rd International Symposium on Advanced Intelligent Systems (SCIS&ISIS);2022-11-29

3. Rough Sets Turn 40: From Information Systems to Intelligent Systems;Annals of Computer Science and Information Systems;2022-09-26

4. A New Approach to Constructing Maximal Consistent Blocks for Mining Incomplete Data;Procedia Computer Science;2022
