Affiliation:
1. LPAIS Laboratory, Faculty of Sciences, USMBA, Fez, Morocco
2. Faculty of Sciences, UAE, Tetouan, Morocco
Abstract
Decision trees are among the most popular classifiers in machine learning, artificial intelligence, and pattern recognition because they are accurate and easy to interpret. During tree construction, a node containing too few observations (a weak node) may still be split, and the resulting split is unreliable and has no statistical value. Existing machine-learning methods such as pruning address this issue by removing the non-meaningful parts of the tree. This paper handles weak nodes differently: we introduce a new algorithm, Enhancing Weak Nodes in Decision Tree (EWNDT), which reinforces weak nodes by augmenting their data with observations from other, similar tree nodes. We call this data augmentation a virtual merging because the merge is temporary and serves to recalculate the best splitting attribute and the best threshold in the weak node. We use two approaches to define the similarity between two nodes. The algorithm is evaluated on benchmark datasets from the UCI machine-learning repository, and the results indicate that EWNDT performs well.
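The idea of reinforcing a weak node can be illustrated with a minimal Python sketch. It is not the authors' EWNDT implementation: the abstract does not specify the two similarity measures, the splitting criterion, or the weak-node threshold, so the centroid-distance similarity, the Gini criterion, the `min_samples` parameter, and all function names below are assumptions chosen only for illustration.

```python
from collections import Counter
from math import dist


def gini(labels):
    """Gini impurity of a list of class labels (assumed splitting criterion)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def best_split(X, y):
    """Return (feature_index, threshold, weighted_gini) of the best binary
    split, or None if no split is possible. Ties keep the first candidate."""
    best = None
    n = len(y)
    for j in range(len(X[0])):
        values = sorted(set(row[j] for row in X))
        # Candidate thresholds are midpoints between consecutive values.
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2
            left = [y[i] for i in range(n) if X[i][j] <= t]
            right = [y[i] for i in range(n) if X[i][j] > t]
            score = (len(left) * gini(left) + len(right) * gini(right)) / n
            if best is None or score < best[2]:
                best = (j, t, score)
    return best


def centroid(X):
    """Mean feature vector of the observations in a node."""
    return [sum(col) / len(X) for col in zip(*X)]


def split_with_virtual_merge(weak, others, min_samples=5):
    """Choose a split for a node given as (X, y).

    If the node holds fewer than `min_samples` observations (hypothetical
    weak-node rule), pool it with the most similar other node, here the one
    with the nearest centroid (hypothetical similarity), and compute the
    split on the pooled data. The node's own data is left unchanged.
    """
    X, y = weak
    if len(y) >= min_samples or not others:
        return best_split(X, y)
    c = centroid(X)
    donor_X, donor_y = min(others, key=lambda node: dist(c, centroid(node[0])))
    return best_split(X + donor_X, y + donor_y)


# Usage: a two-observation node borrows data from its nearest neighbour node.
weak = ([[1.0, 5.0], [2.0, 1.0]], ["a", "b"])
near = ([[3.0, 5.5], [0.5, 0.5], [2.5, 4.5], [1.5, 1.5]], ["a", "b", "a", "b"])
far = ([[10.0, 10.0], [11.0, 12.0]], ["a", "a"])
split = split_with_virtual_merge(weak, [far, near])
```

In this toy example the weak node alone would be split on feature 0, whereas the pooled data favours feature 1, which shows how borrowed observations can change the chosen attribute and threshold without permanently altering the node.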