Complex Data Imputation by Auto-Encoders and Convolutional Neural Networks—A Case Study on Genome Gap-Filling

Published:2020-05-11 Issue:2 Volume:9 Page:37
ISSN:2073-431X
Container-title:Computers
language:en
Short-container-title:Computers

Author:

Cappelletti Luca, Fontana Tommaso, Di Donato Guido Walter, Di Tucci Lorenzo, Casiraghi Elena, Valentini Giorgio

Abstract

Missing data imputation has been a hot topic in the past decade, and many state-of-the-art works have proposed novel solutions that have been applied in a variety of fields. Over the same period, the successful results achieved by deep learning techniques have opened the way to their application to difficult problems where human skill cannot provide a reliable solution. Not surprisingly, some deep learners, mainly exploiting encoder-decoder architectures, have also been designed and applied to the task of missing data imputation. However, most of the proposed imputation techniques have not been designed to tackle “complex data”, that is, high-dimensional data belonging to datasets with huge cardinality and describing complex problems. Specifically, they often need critical parameters to be set manually, or they rely on complex architectures and/or training phases that make their computational load impractical. In this paper, after clustering the state-of-the-art imputation techniques into three broad categories, we briefly review the most representative methods and then describe our data imputation proposals, which exploit deep learning techniques specifically designed to handle complex data. Comparative tests show that our deep learning imputers outperform the state-of-the-art KNN-imputation method when filling gaps in human genome sequences.

Publisher

MDPI AG

Subject

Computer Networks and Communications, Human-Computer Interaction

Link

https://www.mdpi.com/2073-431X/9/2/37/pdf

References: 88 articles.

1. A Survey on Data Imputation Techniques: Water Distribution System as a Use Case

2. Pattern classification with missing data: a review

3. Missing value imputation on missing completely at random data using multilayer perceptrons

4. The nature of sensitivity in monotone missing not at random models

5. A Fast and Flexible Statistical Model for Large-Scale Population Genotype Data: Applications to Inferring Missing Genotypes and Haplotypic Phase

Cited by 12 articles.

1. XU-NetI: Simple U-Shaped Encoder-Decoder Network for Accurate Imputation of Multivariate Missing Data; Franklin Open; 2024-08

2. Integrative Analysis of Genomic Data Types and AI Methodologies in Healthcare Applications; 2024 2nd International Conference on Cyber Resilience (ICCR); 2024-02-26

4. XU-NetI: Simple U-Shaped Encoder-Decoder Network for Accurate Imputation of Multivariate Missing Data; 2023-08-07

5. A systematic review of generative adversarial imputation network in missing data imputation; Neural Computing and Applications; 2023-07-21