DEGAIN: Generative-Adversarial-Network-Based Missing Data Imputation

Published: 2022-12-12 Issue: 12 Volume: 13 Page: 575
ISSN: 2078-2489
Container-title: Information
Language: en
Short-container-title: Information

Authors:

Shahbazian Reza, Trubitsyna Irina

Abstract

Insights and analysis are only as good as the available data. Data cleaning is one of the most important steps in producing quality data for decision making. Machine learning (ML) helps process data quickly and create error-free or limited-error datasets. One of the quality standards for data cleaning is the handling of missing data, also known as data imputation. This research focuses on the use of machine learning methods to deal with missing data. In particular, we propose a generative adversarial network (GAN) based model called DEGAIN to estimate the missing values in a dataset. We evaluate the performance of the presented method and compare the results with some of the existing methods on the publicly available Letter Recognition and SPAM datasets. The Letter dataset consists of 20,000 samples and 16 input features, and the SPAM dataset consists of 4601 samples and 57 input features. The results show that the proposed DEGAIN outperforms the existing methods in terms of the root mean square error and Fréchet inception distance metrics.
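
To make the evaluation idea concrete, the following is a minimal sketch of how an imputation method might be scored with root mean square error. It is not the DEGAIN model: column-mean imputation stands in as a simple baseline. The helper names (`mask_at_random`, `mean_impute`, `rmse_on_missing`), the missing-completely-at-random masking, and the choice to measure error only on the hidden entries are all assumptions made for illustration; the abstract does not specify these details.

```python
import math
import random


def mask_at_random(data, miss_rate, seed=0):
    """Hypothetical helper: build a mask with 1 = observed, 0 = hidden.

    Entries are hidden independently with probability miss_rate
    (an assumed missing-completely-at-random setting).
    """
    rng = random.Random(seed)
    return [[0 if rng.random() < miss_rate else 1 for _ in row] for row in data]


def mean_impute(data, mask):
    """Baseline stand-in (not DEGAIN): fill hidden entries with the column mean
    of the observed values; fall back to 0.0 if a column has none observed."""
    n_cols = len(data[0])
    col_means = []
    for j in range(n_cols):
        observed = [row[j] for row, m in zip(data, mask) if m[j] == 1]
        col_means.append(sum(observed) / len(observed) if observed else 0.0)
    return [
        [x if m_ij == 1 else col_means[j] for j, (x, m_ij) in enumerate(zip(row, m))]
        for row, m in zip(data, mask)
    ]


def rmse_on_missing(truth, imputed, mask):
    """RMSE over the hidden entries only (mask == 0), by assumption."""
    sq_errors = [
        (t - p) ** 2
        for t_row, p_row, m_row in zip(truth, imputed, mask)
        for t, p, m in zip(t_row, p_row, m_row)
        if m == 0
    ]
    if not sq_errors:
        raise ValueError("mask hides no entries; RMSE is undefined")
    return math.sqrt(sum(sq_errors) / len(sq_errors))


# Toy data with a hand-written mask so the example is deterministic.
truth = [[1.0, 10.0], [2.0, 20.0], [3.0, 30.0]]
mask = [[1, 1], [0, 1], [1, 0]]
imputed = mean_impute(truth, mask)
print(rmse_on_missing(truth, imputed, mask))
```

Under these assumptions, a learned imputer would replace `mean_impute` while the masking and scoring steps stay the same, so that competing methods are compared on identical hidden entries.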

Funder

MISE Project True Detective 4.0

Publisher

MDPI AG

Subject

Information Systems

Link

https://www.mdpi.com/2078-2489/13/12/575/pdf
