Data variability in the imputation quality of missing data

Published: 2024-04-03 Issue: 1 Volume: 46 Page: e66185
ISSN: 1807-8621
Container-title: Acta Scientiarum. Agronomy
Short-container-title: Acta Sci. Agron.

Author:

Stochero Elisandra Lúcia Moro, Dal'Col Lúcio Alessandro, Jacobi Luciane Flores

Abstract

Imputation methods were developed to estimate missing data and thereby mitigate the problems caused by the loss of that information. This study assesses whether data variability influences the results obtained after applying an imputation method. Incomplete databases were generated by removing different amounts of data from complete real databases of tomato plant experiments conducted in a randomized block design with three replications and 12 treatments. The evaluated variables were fruit weight per plant, number of fruits per plant, and average fruit length and width, forming eight balanced databases. The distribution-free multiple imputation method was then applied to the incomplete databases, generating complete databases from imputation. The amount of missing information influenced the accuracy measures for the data in this study. Data imputation was inadequate when variability was high but more precise and accurate when variability was low. These results confirm the importance of assessing data variability before choosing to apply an imputation method.
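The remove-impute-compare procedure described in the abstract can be illustrated with a small simulation. The sketch below is not the study's analysis: the data are simulated rather than taken from the tomato experiments, and the imputation step is a simple iterative additive fit (row mean + column mean - grand mean) used as a stand-in for the distribution-free multiple imputation method. All function names, effect sizes, and noise levels are assumptions chosen for illustration.

```python
import numpy as np

def simulate_rbd(rng, n_treat=12, n_block=3, sd=1.0):
    """Simulate a treatments x blocks table: mean + treatment + block + noise.

    The effect sizes (2.0 and 1.0) and the grand mean (50.0) are arbitrary.
    """
    treat = rng.normal(0.0, 2.0, size=(n_treat, 1))
    block = rng.normal(0.0, 1.0, size=(1, n_block))
    noise = rng.normal(0.0, sd, size=(n_treat, n_block))
    return 50.0 + treat + block + noise

def remove_random(rng, table, n_missing):
    """Return a copy with n_missing cells set to NaN, at most one per treatment."""
    out = table.copy()
    rows = rng.choice(table.shape[0], size=n_missing, replace=False)
    cols = rng.integers(0, table.shape[1], size=n_missing)
    out[rows, cols] = np.nan
    return out

def impute_additive(table, n_iter=50):
    """Stand-in imputation: iterate row mean + column mean - grand mean."""
    mask = np.isnan(table)
    # Start from the mean of the observed cells, then refine the missing cells.
    filled = np.where(mask, np.nanmean(table), table)
    for _ in range(n_iter):
        fit = (filled.mean(axis=1, keepdims=True)
               + filled.mean(axis=0, keepdims=True)
               - filled.mean())
        filled[mask] = fit[mask]
    return filled

def rmse_on_missing(complete, incomplete, imputed):
    """Root mean squared error between true and imputed values on removed cells."""
    mask = np.isnan(incomplete)
    return float(np.sqrt(np.mean((complete[mask] - imputed[mask]) ** 2)))

def mean_rmse(sd, n_missing, n_rep=200, seed=0):
    """Average imputation error over repeated simulated experiments."""
    rng = np.random.default_rng(seed)
    errors = []
    for _ in range(n_rep):
        complete = simulate_rbd(rng, sd=sd)
        incomplete = remove_random(rng, complete, n_missing)
        imputed = impute_additive(incomplete)
        errors.append(rmse_on_missing(complete, incomplete, imputed))
    return float(np.mean(errors))

low = mean_rmse(sd=0.5, n_missing=4)   # low residual variability
high = mean_rmse(sd=5.0, n_missing=4)  # high residual variability
print(f"mean RMSE, low variability:  {low:.3f}")
print(f"mean RMSE, high variability: {high:.3f}")
```

Under these assumptions the imputation error grows with the residual standard deviation, which is the same direction of effect the abstract reports. Varying `n_missing` gives a rough view of how the amount of removed data affects the accuracy measure.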

Publisher

Universidade Estadual de Maringa

References (28 articles)

1. Austin, P. C., White, I. R., Lee, D. S., & van Buuren, S. (2021). Missing data in clinical research: A tutorial on multiple imputation. Canadian Journal of Cardiology, 37(9), 1322-1331. DOI: https://doi.org/10.1016/j.cjca.2020.11.010

2. Banzatto, D. A., & Kronka, S. N. (2013). Experimentação agrícola (4. ed.). Jaboticabal, SP: Funep.

3. Bergamo, G. C., Dias, C. T. S., & Krzanowski, W. J. (2008). Distribution-free multiple imputation in an interaction matrix through singular value decomposition. Scientia Agricola, 65(4), 422-427. DOI: https://doi.org/10.1590/S0103-90162008000400015

4. Bleidorn, M. T., Pinto, W. P., Schmidt, I. M., Mendonça, A. S. F., & Reis, J. A. T. (2022). Methodological approaches for imputing missing data into monthly flows series. Revista Ambiente & Água, 17(2), 1-27. DOI: https://doi.org/10.4136/ambi-agua.2795

5. Boomgard-Zagrodnik, J. P., & Brown, D. J. (2022). Machine learning imputation of missing Mesonet temperature observations. Computers and Electronics in Agriculture, 192, 106580. DOI: https://doi.org/10.1016/j.compag.2021.106580