Comparative Evaluation and Comprehensive Analysis of Machine Learning Models for Regression Problems

Author:

Sekeroglu Boran (1, 2), Ever Yoney Kirsal (3, 2), Dimililer Kamil (4, 2), Al-Turjman Fadi (5, 2)

Affiliation:

1. Information Systems Engineering Department, Near East University, Nicosia, Cyprus, Mersin 10, Turkey

2. Research Centre for AI and IoT, Near East University, Nicosia, Cyprus, Mersin 10, Turkey

3. Software Engineering Department, Near East University, Nicosia, Cyprus, Mersin 10, Turkey

4. Electrical and Electronic Engineering Department, Near East University, Nicosia, Cyprus, Mersin 10, Turkey

5. Artificial Intelligence Engineering Department, Near East University, Nicosia, Cyprus, Mersin 10, Turkey

Abstract

Artificial intelligence and machine learning applications are important in almost every field of human life, where they solve problems or support human experts. However, determining which machine learning model will achieve a superior result for a particular problem across the wide range of real-life application areas remains a challenging task for researchers. The success of a model can be affected by several factors, such as dataset characteristics, training strategy, and model responses. Therefore, a comprehensive analysis is required to determine model ability and the efficiency of the considered strategies. This study implemented ten benchmark machine learning models on seventeen varied datasets. Experiments were performed using four training strategies: 60:40, 70:30, and 80:20 hold-out splits and five-fold cross-validation. Three evaluation metrics were used to evaluate the experimental results: mean squared error, mean absolute error, and the coefficient of determination (R2 score). The considered models are analyzed, and each model's advantages, disadvantages, and data dependencies are indicated. Across the large number of experiments performed, the deep Long Short-Term Memory (LSTM) neural network outperformed the other considered models, namely decision tree, linear regression, support vector regression with linear and radial basis function kernels, random forest, gradient boosting, extreme gradient boosting, shallow neural network, and deep neural network. It has also been shown that cross-validation has a tremendous impact on the results of the experiments and should be considered for model evaluation in regression studies where data mining or selection is not performed.
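The evaluation protocol named in the abstract (three hold-out ratios, five-fold cross-validation, and the three metrics) can be illustrated with a minimal sketch. This is not the authors' code: the synthetic data, the one-variable least-squares stand-in model, and every function name below (`fit_line`, `hold_out`, `k_fold`, `evaluate`) are assumptions made only for illustration, using the Python standard library.

```python
import random
from statistics import mean


def mse(y_true, y_pred):
    """Mean squared error."""
    return mean((t - p) ** 2 for t, p in zip(y_true, y_pred))


def mae(y_true, y_pred):
    """Mean absolute error."""
    return mean(abs(t - p) for t, p in zip(y_true, y_pred))


def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    y_bar = mean(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - y_bar) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def fit_line(xs, ys):
    """Least-squares fit of y = a*x + b (hypothetical stand-in model)."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    a = sxy / sxx
    return a, y_bar - a * x_bar


def hold_out(n, train_fraction, rng):
    """Shuffle indices 0..n-1 and split them into train and test parts."""
    idx = list(range(n))
    rng.shuffle(idx)
    cut = int(round(n * train_fraction))
    return idx[:cut], idx[cut:]


def k_fold(n, k, rng):
    """Yield (train, test) index lists for k-fold cross-validation."""
    idx = list(range(n))
    rng.shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    for i in range(k):
        train = [j for f in folds[:i] + folds[i + 1:] for j in f]
        yield train, folds[i]


def evaluate(xs, ys, train_idx, test_idx):
    """Fit on the training indices and score on the test indices."""
    a, b = fit_line([xs[i] for i in train_idx], [ys[i] for i in train_idx])
    y_true = [ys[i] for i in test_idx]
    y_pred = [a * xs[i] + b for i in test_idx]
    return {"mse": mse(y_true, y_pred),
            "mae": mae(y_true, y_pred),
            "r2": r2_score(y_true, y_pred)}


# Synthetic data (an assumption): y = 2x + 1 plus Gaussian noise.
rng = random.Random(0)
xs = [i / 10 for i in range(100)]
ys = [2 * x + 1 + rng.gauss(0, 0.5) for x in xs]

results = {}
for fraction in (0.6, 0.7, 0.8):  # 60:40, 70:30, and 80:20 hold-out
    train_idx, test_idx = hold_out(len(xs), fraction, rng)
    results[f"hold-out {fraction:.0%}"] = evaluate(xs, ys, train_idx, test_idx)

# Five-fold cross-validation: average each metric over the folds.
fold_scores = [evaluate(xs, ys, tr, te) for tr, te in k_fold(len(xs), 5, rng)]
results["5-fold CV"] = {m: mean(s[m] for s in fold_scores)
                        for m in ("mse", "mae", "r2")}

for name, scores in results.items():
    print(name, {m: round(v, 3) for m, v in scores.items()})
```

A hold-out score depends on a single random split, while the cross-validated score averages over five splits in which every sample is used for testing exactly once. That difference is one plausible reason the two kinds of strategy can rank models differently.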

Publisher

MIT Press

Subject

Artificial Intelligence, Library and Information Sciences, Computer Science Applications, Information Systems

Cited by 14 articles.