Evaluation methodology for deep learning imputation models

Published:2022-09-21 Issue:22 Volume:247 Page:1972-1987
ISSN:1535-3702
Container-title:Experimental Biology and Medicine
language:en
Short-container-title:Exp Biol Med (Maywood)

Author:

Boursalie Omar¹², Samavi Reza²³, Doyle Thomas E.¹²⁴

Affiliation:

1. School of Biomedical Engineering, McMaster University, Hamilton, ON L8S 4L8, Canada

2. Vector Institute, Toronto, ON M5G 1M1, Canada

3. Department of Electrical, Computer, and Biomedical Engineering, Toronto Metropolitan University, Toronto, ON M5B 2K3, Canada

4. Department of Electrical and Computer Engineering, McMaster University, Hamilton, ON L8S 4L8, Canada

Abstract

There is growing interest in imputing missing data in tabular datasets using deep learning. Existing deep learning–based imputation models have been commonly evaluated using root mean square error (RMSE) as the predictive accuracy metric. In this article, we investigate the limitations of assessing deep learning–based imputation models by conducting a comparative analysis between RMSE and alternative metrics in the statistical literature including qualitative, predictive accuracy, statistical distance, and descriptive statistics. We design a new aggregated metric, called reconstruction loss (RL), to evaluate deep learning–based imputation models. We also develop and evaluate a novel imputation evaluation methodology based on RL. To minimize model and dataset biases, we use a regression imputation model and two different deep learning imputation models: denoising autoencoders and generative adversarial nets. We also use two tabular datasets from different industry sectors: health care and financial. Our results show that the proposed methodology is effective in evaluating multiple properties of the deep learning–based imputation model’s reconstruction performance.
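As a rough illustration of why RMSE alone may not characterize an imputation, the sketch below compares RMSE on the missing entries with a simple descriptive statistic (the per-column standard deviation). It is not the article's reconstruction loss, whose definition is not given in this abstract. The synthetic data, the 30% missing rate, the mean-imputation baseline, and the helper names `masked_rmse` and `mean_impute` are all assumptions made for this example.

```python
import numpy as np

def masked_rmse(truth, imputed, mask):
    """RMSE computed only over entries that were missing (mask == True)."""
    diff = (truth - imputed)[mask]
    return float(np.sqrt(np.mean(diff ** 2)))

def mean_impute(data, mask):
    """Replace masked entries with the column mean of the observed entries."""
    observed = np.where(mask, np.nan, data)
    col_means = np.nanmean(observed, axis=0)  # one mean per column
    return np.where(mask, col_means, data)    # broadcasts means across rows

# Hypothetical synthetic table: 1000 rows, 3 numeric columns.
rng = np.random.default_rng(0)
truth = rng.normal(loc=0.0, scale=1.0, size=(1000, 3))

# Assumed missingness: 30% of entries removed completely at random.
mask = rng.random(truth.shape) < 0.3

imputed = mean_impute(truth, mask)

# Predictive accuracy view: error on the entries that were imputed.
rmse = masked_rmse(truth, imputed, mask)

# Descriptive statistics view: spread of each column before and after.
std_truth = truth.std(axis=0)
std_imputed = imputed.std(axis=0)

print("RMSE on missing entries:", rmse)
print("Column std (ground truth):", std_truth)
print("Column std (after imputation):", std_imputed)
```

In this sketch, mean imputation shrinks the standard deviation of every column, a distortion that a single RMSE value does not reveal. This is one reason to report distributional or descriptive metrics alongside predictive accuracy.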

Funder

Southern Ontario Smart Computing Innovation Platform

Natural Sciences and Engineering Research Council of Canada

Canadian Department of National Defence: Innovation for Defence Excellence & Security Program

Publisher

SAGE Publications

Subject

General Biochemistry, Genetics and Molecular Biology

Link

http://journals.sagepub.com/doi/pdf/10.1177/15353702221121602

Reference: 45 articles.

1. Inference and missing data

2. Handling incomplete heterogeneous data using VAEs

Cited by 5 articles.

1. Comparison of the performance of multiple imputation models in filling gaps in hourly and daily meteorological series from two locations in the state of São Paulo-Brazil;Modeling Earth Systems and Environment;2023-09-30

2. An Outlier Detection Study of Ozone in Kolkata India by the Classical Statistics, Statistical Process Control and Functional Data Analysis;Sustainability;2023-08-24

3. Which Industrial Sectors Are Affected by Artificial Intelligence? A Bibliometric Analysis of Trends and Perspectives;Sustainability;2023-08-09

4. Handling missing values in healthcare data: A systematic review of deep learning-based imputation techniques;Artificial Intelligence in Medicine;2023-08

5. An Approach Based on Web Scraping and Denoising Encoders to Curate Food Security Datasets;Agriculture;2023-05-06