Evaluation of Regression Models: Model Assessment, Model Selection and Generalization Error

Published: 2019-03-22 Issue: 1 Volume: 1 Page: 521-551
ISSN: 2504-4990
Container-title: Machine Learning and Knowledge Extraction
Language: en
Short-container-title: MAKE

Author:

Emmert-Streib Frank, Dehmer Matthias

Abstract

When performing a regression or classification analysis, one needs to specify a statistical model. This model should avoid both overfitting and underfitting the data, and achieve a low generalization error that characterizes its prediction performance. In order to identify such a model, one needs to decide which model to select from candidate model families based on performance evaluations. In this paper, we review the theoretical framework of model selection and model assessment, including error-complexity curves, the bias-variance tradeoff, and learning curves for evaluating statistical models. We discuss criterion-based, step-wise selection procedures and resampling methods for model selection, with cross-validation providing the simplest and most generic means of computationally estimating all required entities. To make the theoretical concepts transparent, we present worked examples for linear regression models. However, our conceptual presentation is extensible to more general models, as well as classification problems.

Publisher

MDPI AG

Subject

General Economics, Econometrics and Finance

Link

https://www.mdpi.com/2504-4990/1/1/32/pdf

References: 83 articles.

1. Understanding the paradigm shift to computational social science in the presence of big data

2. Data Science and its Relationship to Big Data and Data-Driven Decision Making

3. Data Science in Statistics Curricula: Preparing Students to “Think with Data”

4. The Process of Analyzing Data is the Emergent Feature of Data Science

5. Defining Data Science by a Data-Driven Quantification of the Community

Cited by 67 articles.

1. Compressive strength prediction of cement base under sulfate attack by machine learning approach;Case Studies in Construction Materials;2024-12

2. Predictive modeling of patulin accumulation in apple lesions infected by Penicillium expansum using machine learning;Postharvest Biology and Technology;2024-11

3. Viability of high-frequency environmental DNA (eDNA) sampling as a fish enumeration tool;Ecological Indicators;2024-09

4. Optimizing arsenic removal from groundwater using continuous flow electrocoagulation with iron and aluminum electrodes: An experimental and modeling approach;Journal of Water Process Engineering;2024-09

5. A Novel Hybrid Regression Model for Banking Loss Estimation;Bingöl Üniversitesi İktisadi ve İdari Bilimler Fakültesi Dergisi;2024-06-27