Performance of Machine Learning Algorithms with Different K Values in K-fold Cross-Validation

Published: 2021-12-08 Issue: 6 Volume: 13 Page: 61-71
ISSN: 2074-9007
Container-title: International Journal of Information Technology and Computer Science
Short-container-title: IJITCS

Author:

Kofi Nti Isaac, Nyarko-Boateng Owusu, Aning Justice

Abstract

The numerical value of k in the k-fold cross-validation training technique for machine learning predictive models is an essential element that impacts the model’s performance. A good choice of k results in better accuracy, while a poorly chosen value of k might degrade the model’s performance. In the literature, the most commonly used values of k are five (5) and ten (10), as these two values are believed to give test error rate estimates that suffer neither from extremely high bias nor from very high variance. However, there is no formal rule. To the best of our knowledge, few experimental studies have attempted to investigate the effect of diverse k values in training different machine learning models. This paper empirically analyses the prevalence and effect of distinct k values (3, 5, 7, 10, 15 and 20) on the validation performance of four well-known machine learning algorithms (MLAs): Gradient Boosting Machine (GBM), Logistic Regression (LR), Decision Tree (DT) and K-Nearest Neighbours (KNN). It was observed that the best value of k and the resulting validation performance differ from one machine learning algorithm to another for the same classification task. However, our empirical results suggest that k = 7 offers a slight increase in validation accuracy and area under the curve with lower computational complexity than k = 10 across most MLAs. We discuss the study outcomes in detail and outline some guidelines for beginners in the machine learning field on selecting the best k value and machine learning algorithm for a given task.
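
As a rough illustration of the kind of comparison the abstract describes, the following sketch runs k-fold cross-validation for each of the k values listed above and reports the mean validation accuracy. It is not the authors' code. The synthetic two-class data, the hand-written KNN classifier, and every function name (`make_blobs`, `k_fold_indices`, `knn_predict`, `cross_val_accuracy`) are assumptions made purely for illustration, and only the Python standard library is used.

```python
import random
from statistics import mean

K_VALUES = (3, 5, 7, 10, 15, 20)  # the k values examined in the abstract


def make_blobs(n_per_class=60, seed=0):
    """Synthetic two-class 2-D data (an assumption; not the paper's data)."""
    rng = random.Random(seed)
    data = []
    for label, (cx, cy) in enumerate([(0.0, 0.0), (2.0, 2.0)]):
        for _ in range(n_per_class):
            data.append(((rng.gauss(cx, 1.0), rng.gauss(cy, 1.0)), label))
    rng.shuffle(data)  # shuffle so each fold mixes both classes
    return data


def k_fold_indices(n_samples, k):
    """Split range(n_samples) into k contiguous folds whose sizes differ by at most one."""
    base, extra = divmod(n_samples, k)
    folds, start = [], 0
    for i in range(k):
        size = base + (1 if i < extra else 0)
        folds.append(list(range(start, start + size)))
        start += size
    return folds


def knn_predict(train, point, n_neighbours=5):
    """Majority vote among the n_neighbours closest training points (labels are 0 or 1)."""
    def sq_dist(sample):
        (x, y), _ = sample
        return (x - point[0]) ** 2 + (y - point[1]) ** 2

    nearest = sorted(train, key=sq_dist)[:n_neighbours]
    votes_for_one = sum(label for _, label in nearest)
    return 1 if votes_for_one * 2 > len(nearest) else 0


def cross_val_accuracy(data, k):
    """Mean accuracy over k folds; each fold is held out once for validation."""
    fold_scores = []
    for fold in k_fold_indices(len(data), k):
        held_out = set(fold)
        train = [s for i, s in enumerate(data) if i not in held_out]
        correct = sum(knn_predict(train, data[i][0]) == data[i][1] for i in fold)
        fold_scores.append(correct / len(fold))
    return mean(fold_scores)


if __name__ == "__main__":
    data = make_blobs()
    for k in K_VALUES:
        acc = cross_val_accuracy(data, k)
        print(f"k={k:2d}  mean validation accuracy={acc:.3f}")
```

Larger k means more model fits on larger training sets, which is one way to see the computational cost the abstract weighs against accuracy. Any numbers this sketch prints describe the toy data only and say nothing about the paper's results.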

Publisher

MECS Publisher

Link

https://www.mecs-press.org/ijitcs/ijitcs-v13-n6/IJITCS-V13-N6-5.pdf

Cited by 38 articles.

1. Passive over active: How low-cost strategies influence urban energy equity;Sustainable Cities and Society;2024-11

2. Predicting Steady-State Metabolic Power in Cerebral Palsy, Stroke, and the Elderly During Walking With and Without Assistive Devices;Annals of Biomedical Engineering;2024-09-08

3. Brain tumor detection and segmentation using deep learning;Magnetic Resonance Materials in Physics, Biology and Medicine;2024-09-04

4. High-Accuracy Airborne Rangefinder via Deep Learning Based on Piezoelectric Micromachined Ultrasonic Cantilevers;IEEE Transactions on Ultrasonics, Ferroelectrics, and Frequency Control;2024-09

5. Hybrid modelling of nitrogen removal by biofiltration using high-frequent operational data;Water Science & Technology;2024-08-27