Symbolic regression as a feature engineering method for machine and deep learning regression tasks

Published:2024-06-01 Issue:2 Volume:5 Page:025065
ISSN:2632-2153
Container-title:Machine Learning: Science and Technology
Short-container-title:Mach. Learn.: Sci. Technol.

Author:

Shmuel Assaf, Glickman Oren, Lazebnik Teddy

Abstract

In machine learning (ML) and deep learning (DL) regression tasks, effective feature engineering (FE) is pivotal to model performance. Traditional FE approaches often rely on domain expertise to manually design features for ML models. In DL models, FE is embedded in the neural network’s architecture, which makes it hard to interpret. In this study, we propose integrating symbolic regression (SR) as an FE step before an ML model to improve its performance. We show, through extensive experiments on synthetic and 21 real-world datasets, that incorporating SR-derived features significantly enhances the predictive capabilities of both ML and DL regression models, with a 34%–86% root mean square error (RMSE) improvement on synthetic datasets and a 4%–11.5% improvement on real-world datasets. In an additional realistic use case, we show that the proposed method improves ML performance in predicting superconducting critical temperatures based on Eliashberg theory by more than 20% in terms of RMSE. These results outline the potential of SR as an FE component in data-driven models, improving them in terms of both performance and interpretability.
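
The idea of adding an SR-derived feature before fitting a regression model, and then reporting the relative RMSE improvement, can be illustrated with a minimal sketch. This is not the authors' procedure: the fixed `CANDIDATES` library is a hypothetical stand-in for a real symbolic regression search, the synthetic target `y = 2x^2 + 1` and the plain least-squares model are assumptions chosen for illustration, and all function names are invented here.

```python
import math
import random

# Hypothetical candidate expressions; a real SR tool would search a far
# larger space of formulas instead of this fixed, hand-written library.
CANDIDATES = {
    "x**2": lambda x: x * x,
    "sin(x)": math.sin,
    "exp(x)": math.exp,
}

def solve(a, b):
    """Solve the small linear system a @ w = b by Gauss-Jordan elimination."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        # Partial pivoting: move the largest entry of this column to the diagonal.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [v - factor * p for v, p in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]

def fit_ols(rows, y):
    """Least-squares weights via the normal equations (rows include a bias column)."""
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * t for r, t in zip(rows, y)) for i in range(k)]
    return solve(xtx, xty)

def rmse(rows, y, w):
    """Root mean square error of the linear model w on (rows, y)."""
    errs = [(sum(wi * ri for wi, ri in zip(w, r)) - t) ** 2 for r, t in zip(rows, y)]
    return math.sqrt(sum(errs) / len(errs))

# Assumed synthetic data: y = 2x^2 + 1 plus small Gaussian noise.
random.seed(0)
xs = [random.uniform(-2.0, 2.0) for _ in range(200)]
ys = [2.0 * x * x + 1.0 + random.gauss(0.0, 0.1) for x in xs]
x_train, x_test = xs[:150], xs[150:]
y_train, y_test = ys[:150], ys[150:]

# Baseline model: raw feature only.
base_w = fit_ols([[1.0, x] for x in x_train], y_train)
base_rmse = rmse([[1.0, x] for x in x_test], y_test, base_w)

# Pick the candidate feature that gives the lowest training RMSE when appended.
def train_rmse(f):
    rows = [[1.0, x, f(x)] for x in x_train]
    return rmse(rows, y_train, fit_ols(rows, y_train))

best_name = min(CANDIDATES, key=lambda name: train_rmse(CANDIDATES[name]))
best_f = CANDIDATES[best_name]

# Augmented model: raw feature plus the selected derived feature.
aug_w = fit_ols([[1.0, x, best_f(x)] for x in x_train], y_train)
aug_rmse = rmse([[1.0, x, best_f(x)] for x in x_test], y_test, aug_w)

# Relative RMSE improvement, in percent, measured on the held-out split.
improvement = 100.0 * (base_rmse - aug_rmse) / base_rmse
print(f"selected feature: {best_name}")
print(f"baseline RMSE {base_rmse:.3f}, augmented RMSE {aug_rmse:.3f}, "
      f"improvement {improvement:.1f}%")
```

The derived feature is appended to the original input rather than replacing it, so the downstream model can still use the raw variable; the improvement is computed on held-out data so that it reflects generalization rather than fit to the training split.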

Publisher

IOP Publishing

Link

https://iopscience.iop.org/article/10.1088/2632-2153/ad513a/pdf

Cited by 2 articles.

1. Neural network integrated with symbolic regression for multiaxial fatigue life prediction;International Journal of Fatigue;2024-11

2. Machine learning computational model to predict lung cancer using electronic medical records;Cancer Epidemiology;2024-10