Statistical analysis of various splitting criteria for decision trees

Published: 2023-01 Volume: 17
ISSN: 1748-3018
Container-title: Journal of Algorithms & Computational Technology
Language: en
Short-container-title: Journal of Algorithms & Computational Technology

Author:

Aaboub Fadwa¹, Chamlal Hasna¹, Ouaderhman Tayeb¹

Affiliation:

1. Fundamental and Applied Mathematics Laboratory, Department of Mathematics and Informatics, Faculty of Sciences Ain Chock, Hassan II University, Casablanca, Morocco

Abstract

Decision trees are frequently used to solve classification problems in data mining and machine learning, owing to their many advantages, including a clear and simple architecture, excellent quality, and resilience. Many decision tree algorithms have been developed using a variety of attribute selection criteria, all following the top-down partitioning strategy. However, their effectiveness depends on the choice of splitting method. In this work, six decision tree algorithms based on six different attribute evaluation metrics are therefore gathered in order to compare their performance. The algorithms were chosen to cover four categories of splitting criteria: criteria based on information theory, criteria based on distance, statistical criteria, and other splitting criteria. The six approaches are iterative dichotomizer 3 (ID3; first category), C4.5 (first category), classification and regression trees (CART; second category), the Pearson’s correlation coefficient based decision tree (third category), dispersion ratio (third category), and the feature weight based decision tree algorithm (last category). The six procedures are assessed on eleven data sets in terms of classification accuracy, tree depth, number of leaf nodes, and tree construction time. The Friedman test and the post hoc Nemenyi test are then used to examine the results. These two tests indicate that the ID3 and CART methods perform better than the other decision tree methods.
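
The abstract names the splitting criteria but does not state their formulas. As a rough illustration only, the sketch below uses the standard textbook definitions of three of them: information gain (usually associated with iterative dichotomizer 3), gain ratio (usually associated with C4.5), and Gini impurity decrease (usually associated with classification and regression trees). The function names and the toy data are hypothetical and are not taken from the paper, and the Gini function splits on every attribute value at once, whereas classification and regression trees normally use binary splits.

```python
from collections import Counter
from math import log2


def entropy(items):
    """Shannon entropy (in bits) of a sequence of discrete values."""
    n = len(items)
    return -sum((c / n) * log2(c / n) for c in Counter(items).values())


def gini(labels):
    """Gini impurity of a sequence of class labels."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def partition(values, labels):
    """Group the class labels by the attribute value of each sample."""
    groups = {}
    for v, y in zip(values, labels):
        groups.setdefault(v, []).append(y)
    return list(groups.values())


def information_gain(values, labels):
    """Entropy of the labels minus the weighted entropy after the split."""
    n = len(labels)
    remainder = sum(len(g) / n * entropy(g) for g in partition(values, labels))
    return entropy(labels) - remainder


def gain_ratio(values, labels):
    """Information gain normalised by the entropy of the attribute itself."""
    split_info = entropy(values)
    if split_info == 0:  # attribute takes a single value: no usable split
        return 0.0
    return information_gain(values, labels) / split_info


def gini_decrease(values, labels):
    """Gini impurity of the labels minus the weighted impurity after the split."""
    n = len(labels)
    weighted = sum(len(g) / n * gini(g) for g in partition(values, labels))
    return gini(labels) - weighted


# Hypothetical toy data: one attribute separates the classes, the other does not.
labels = ["yes", "yes", "no", "no"]
informative = ["a", "a", "b", "b"]
uninformative = ["x", "y", "x", "y"]

for name, attr in [("informative", informative), ("uninformative", uninformative)]:
    print(name,
          information_gain(attr, labels),
          gain_ratio(attr, labels),
          gini_decrease(attr, labels))
```

Under each of the three criteria, a top-down tree builder would pick the attribute with the highest score at every node, so the informative attribute is preferred here in all three cases.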

Publisher

SAGE Publications

Subject

Applied Mathematics, Computational Mathematics, Numerical Analysis

Link

http://journals.sagepub.com/doi/pdf/10.1177/17483026231198181

References: 24 articles.

1. Induction of decision trees

2. A new criterion in selection and discretization of attributes for the generation of decision trees

3. An Improved Attribute Selection Measure for Decision Tree Induction

4. Breiman L, Friedman J, Stone CJ et al. Classification and regression tree analysis, 1984.