Synthetic Data Generation for Data Envelopment Analysis

Published:2023-09-27 Issue:10 Volume:8 Page:146
ISSN:2306-5729
Container-title:Data
language:en
Short-container-title:Data

Author:

Lychev Andrey V.¹

Affiliation:

1. College of Information Technologies and Computer Sciences, National University of Science and Technology “MISIS”, 4 Leninsky Ave., Bldg. 1, 119049 Moscow, Russia

Abstract

This paper addresses the problem of generating artificial datasets for data envelopment analysis (DEA), which can be used to test DEA models and methods. In particular, papers that apply DEA to big data often rely on synthetic data generation to obtain large-scale datasets, because large real datasets available in the public domain are extremely rare. The paper proposes an algorithm that takes a real dataset as input and complements it with artificial efficient and inefficient units. The generation process extends the efficient part of the frontier by inserting artificial efficient units while keeping the original efficient frontier unchanged. For this purpose, the algorithm uses the assurance region method and consistently relaxes weight restrictions over the iterations. This approach produces synthetic datasets that are closer to real ones than those produced by algorithms that generate data from scratch. The proposed algorithm is applied to a pair of small real-life datasets, which were expanded to 50K units as a result. Computational experiments show that the artificially generated decision-making units (DMUs) preserve isotonicity and do not increase the collinearity of the original data as a whole.
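The idea of adding inefficient units without moving the frontier can be illustrated with a minimal sketch. This is not the paper's algorithm (the abstract does not give its steps, and the assurance region part is not covered here); it only shows one simple, well-known construction: copy an existing unit, keep its outputs, and inflate its inputs by a factor 1/theta with theta below 1. Under free disposability the new unit is dominated by its parent, so the efficient frontier is unchanged. The function name `make_inefficient_units`, the `min_score` parameter, and the sample data are hypothetical.

```python
import random

def make_inefficient_units(units, count, min_score=0.3, seed=0):
    """Return `count` artificial units, each dominated by a randomly chosen parent.

    units: list of (inputs, outputs) pairs holding positive numbers.
    Each new unit keeps the parent's outputs and divides every input by
    theta, where theta is drawn uniformly from [min_score, 0.99].
    theta is an upper bound on the radial input efficiency of the new unit
    (it equals that efficiency only when the parent itself is efficient).
    """
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    generated = []
    for _ in range(count):
        inputs, outputs = rng.choice(units)
        theta = rng.uniform(min_score, 0.99)
        inflated = [x / theta for x in inputs]  # more input, same output
        generated.append((inflated, list(outputs), theta))
    return generated

# Hypothetical two-input, one-output data standing in for a real dataset.
real_units = [([2.0, 3.0], [5.0]), ([4.0, 1.0], [6.0])]
synthetic = make_inefficient_units(real_units, count=5)
for inputs, outputs, theta in synthetic:
    print([round(x, 3) for x in inputs], outputs, round(theta, 3))
```

Generating new efficient units is the harder part, since each one must extend the frontier without displacing the original efficient units; the sketch above deliberately leaves that out.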

Funder

Russian Science Foundation

Publisher

MDPI AG

Subject

Information Systems and Management, Computer Science Applications, Information Systems

Link

https://www.mdpi.com/2306-5729/8/10/146/pdf

Cited by 1 article.

1. Synthetic Data Generation;Advances in Business Information Systems and Analytics;2024-01-16