Authors:
Khoubrane Yousef, Ramli Noor Asiah, Mohd Khairi Siti Shaliza
Abstract
Data Envelopment Analysis (DEA) is a well-established non-parametric technique for assessing the efficiency of Decision-Making Units (DMUs). However, DEA cannot predict the efficiency of new DMUs without re-running the analysis on the entire dataset, and previous studies have integrated Machine Learning (ML) to address this limitation. Yet such integration often lacks a thorough evaluation of how well ML can replace the current DEA process. This paper presents the results of an empirical study that employed eight ML models, two DEA variants, and a dataset of S&P 500 companies. The findings show that ML predicts efficiency values derived from a single DEA run with high precision and performs comparably when predicting the efficiency of new DMUs, thus eliminating the need for repeated DEA. This highlights ML's robustness as a complementary tool for DEA in continuous efficiency estimation. Notably, boosting models within the Ensemble Learning category consistently outperformed the other models, highlighting their effectiveness for DEA efficiency prediction. In particular, CatBoost was the top-performing model, followed by LightGBM in second position in most cases. When the study was extended to five enlarged datasets, the model exhibited superior R² values in the Constant Returns to Scale (CRS) scenario.
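To make the re-computation issue concrete, the following is a minimal illustrative sketch and not the authors' implementation. It assumes the simplest possible setting, a single input and a single output per DMU, where the CRS efficiency reduces to each DMU's output/input ratio divided by the best ratio in the dataset. The general multi-input, multi-output case requires solving a linear program per DMU. The function name `crs_efficiency` and all data values are hypothetical.

```python
def crs_efficiency(inputs, outputs):
    """CRS efficiency for the single-input, single-output special case.

    Each DMU's score is its output/input ratio divided by the best
    ratio in the dataset, so every score is relative to the frontier
    defined by the DMUs currently included.
    """
    ratios = [y / x for x, y in zip(inputs, outputs)]
    best = max(ratios)
    return [r / best for r in ratios]


# Hypothetical data: three DMUs, one input and one output each.
inputs = [2.0, 4.0, 5.0]
outputs = [2.0, 3.0, 2.5]
before = crs_efficiency(inputs, outputs)

# A new DMU with a higher output/input ratio shifts the frontier,
# so the scores of all existing DMUs have to be recomputed.
after = crs_efficiency(inputs + [2.0], outputs + [4.0])

print(before)
print(after)
```

In this toy case the first DMU is fully efficient until the new DMU arrives, after which its score halves. This dependence on the whole dataset is what motivates training an ML model on the scores from a single DEA run instead of repeating the analysis.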