Author:
Jankowski C., Reda D., Mańkowski M., Borowik G.
Abstract
Discretization is one of the most important parts of decision table preprocessing. Transforming continuous values of attributes into discrete intervals influences further analysis using data mining methods. In particular, the accuracy of generated predictions is highly dependent on the quality of discretization. The paper contains a description of three new heuristic algorithms for discretization of numeric data, based on Boolean reasoning. Additionally, an entropy-based evaluation of discretization is introduced to compare the results of the proposed algorithms with the results of leading university software for data analysis. Considering the discretization as a data compression method, the average compression ratio achieved for databases examined in the paper is 8.02 while maintaining the consistency of databases at 100%.
Subject
Artificial Intelligence; Computer Networks and Communications; General Engineering; Information Systems; Atomic and Molecular Physics, and Optics
Cited by
7 articles.