Author:
Yamen El Touati, Jihane Ben Slimane, Taoufik Saidani
Abstract
Feature selection is a fundamental aspect of machine learning, crucial for improving both the accuracy and the efficiency of models. By sifting through abundant data to identify the most significant features, it improves predictive accuracy and reduces the risk of overfitting. The technique not only speeds up model training by lowering computational requirements, but also enhances interpretability, yielding more transparent and reliable predictions. Deliberately omitting irrelevant variables thus refines the model and is a crucial step toward more flexible and comprehensible machine learning results. This study assesses the effectiveness of feature selection on regression models, with impact measured using the Mean Squared Error (MSE) metric. A variety of regression algorithms were evaluated, and feature selection techniques, both statistical and algorithmic, were then applied, including SelectKBest, PCA, and RFE with Linear Regression and Random Forest. After feature selection, linear models showed improved MSE, highlighting the value of removing redundant data. These results underscore the nuanced impact of feature selection on model performance and argue for a tailored strategy to maximize prediction accuracy.
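The sketch below illustrates the kind of comparison the abstract describes: a linear regressor evaluated by test-set MSE with no selection and then with SelectKBest, PCA, and RFE (wrapped around Linear Regression and Random Forest). It is a minimal example using scikit-learn on a synthetic dataset; the dataset, the choice of k = 10 features, and the pipeline configurations are illustrative assumptions, not the study's actual setup.

```python
# Hedged sketch: compare MSE of a linear model before/after feature selection.
# The synthetic data, k=10, and estimator choices are assumptions for illustration.
from sklearn.datasets import make_regression
from sklearn.decomposition import PCA
from sklearn.ensemble import RandomForestRegressor
from sklearn.feature_selection import RFE, SelectKBest, f_regression
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline

# Synthetic regression data with many uninformative features,
# mimicking the redundant inputs feature selection is meant to prune.
X, y = make_regression(n_samples=500, n_features=50, n_informative=10,
                       noise=10.0, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

candidates = {
    "no selection": LinearRegression(),
    "SelectKBest (k=10)": make_pipeline(
        SelectKBest(f_regression, k=10), LinearRegression()),
    "PCA (10 components)": make_pipeline(
        PCA(n_components=10), LinearRegression()),
    "RFE (Linear Regression)": make_pipeline(
        RFE(LinearRegression(), n_features_to_select=10), LinearRegression()),
    "RFE (Random Forest)": make_pipeline(
        RFE(RandomForestRegressor(random_state=0), n_features_to_select=10),
        LinearRegression()),
}

# Fit each variant and report held-out MSE, the metric used in the study.
for name, model in candidates.items():
    model.fit(X_train, y_train)
    mse = mean_squared_error(y_test, model.predict(X_test))
    print(f"{name}: MSE = {mse:.2f}")
```

On data like this, variants that keep only the informative features typically match or beat the unselected baseline while fitting faster, which is the pattern the abstract reports for linear models.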
Publisher:
Engineering, Technology & Applied Science Research