Authors:
Mardiansyah Heru, Zarlis Muhammad, Sitompul Opim Salim
Abstract
The C4.5 algorithm still has weaknesses when predicting or classifying data with a large number of classes, which can increase decision-making time. An approach is therefore needed to improve the performance of C4.5 by selecting split attributes with the help of the average gain value. C4.5 is a Decision Tree method that classifies data using the concept of information entropy. It builds on the split criterion of ID3: the ID3 algorithm uses Information Gain (IG) as the split attribute criterion, while C4.5 uses Gain Ratio (GR), a modification of IG, with the root taken from the attribute with the highest gain. In tests on the Water Quality dataset, the C4.5 method achieved an accuracy of 91.30% with a classification error rate of 8.70%, indicating that C4.5 was implemented successfully for predicting the Water Quality dataset.
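The split criteria named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes categorical attributes and uses the textbook definitions of entropy, Information Gain, and Gain Ratio. The function names, the toy columns `turbidity` and `ph`, and the class labels are hypothetical. The way the average gain is applied in `select_split` (only attributes with at least average gain are compared by Gain Ratio) follows the usual C4.5 heuristic and is an assumption about what the abstract means.

```python
from collections import Counter
from math import log2


def entropy(items):
    """Shannon entropy (in bits) of a list of categorical values."""
    total = len(items)
    return -sum((n / total) * log2(n / total) for n in Counter(items).values())


def information_gain(values, labels):
    """ID3 criterion: entropy of the labels minus the weighted entropy after the split."""
    total = len(labels)
    partitions = {}
    for value, label in zip(values, labels):
        partitions.setdefault(value, []).append(label)
    remainder = sum(len(p) / total * entropy(p) for p in partitions.values())
    return entropy(labels) - remainder


def gain_ratio(values, labels):
    """C4.5 criterion: Information Gain divided by the split information."""
    split_info = entropy(values)  # entropy of the attribute's own value distribution
    if split_info == 0:
        return 0.0  # attribute has a single value, so it cannot split the data
    return information_gain(values, labels) / split_info


def select_split(columns, labels):
    """Assumed heuristic: keep attributes with gain >= average gain, pick the best Gain Ratio."""
    gains = {name: information_gain(col, labels) for name, col in columns.items()}
    average_gain = sum(gains.values()) / len(gains)
    # Fall back to all attributes if rounding leaves no candidate at or above the average.
    candidates = [n for n in gains if gains[n] >= average_gain] or list(gains)
    return max(candidates, key=lambda n: gain_ratio(columns[n], labels))


# Hypothetical toy data, not taken from the Water Quality dataset.
columns = {
    "turbidity": ["low", "low", "high", "high"],
    "ph": ["acidic", "neutral", "acidic", "neutral"],
}
labels = ["safe", "safe", "unsafe", "unsafe"]
best_attribute = select_split(columns, labels)
```

In this toy data `turbidity` separates the two classes completely while `ph` carries no information about them, so `turbidity` is the only attribute at or above the average gain and is chosen as the root.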
Subject
General Physics and Astronomy