Imputation of missing values for cochlear implant candidate audiometric data and potential applications

Published: 2023-02-06
Volume: 18
Issue: 2
Page: e0281337
ISSN: 1932-6203
Container title: PLOS ONE
Short container title: PLoS ONE
Language: en

Author:

Pavelchek Cole, Michelson Andrew P., Walia Amit, Ortmann Amanda, Herzog Jacques, Buchman Craig A., Shew Matthew A.

Abstract

Objective: To assess the real-world performance of popular imputation algorithms on cochlear implant (CI) candidate audiometric data.

Methods: 7,451 audiograms from patients undergoing CI candidacy evaluation were pooled from 32 institutions, with complete case analysis yielding 1,304 audiograms. Imputation model performance was assessed with nested cross-validation on randomly generated sparse datasets with various amounts of missing data, distributions of sparsity, and dataset sizes. A threshold for safe imputation was defined as root mean square error (RMSE) <10 dB. Models included univariate imputation, interpolation, multiple imputation by chained equations (MICE), k-nearest neighbors, gradient boosted trees, and neural networks.

Results: Greater quantities of missing data were associated with worse performance. Sparsity in audiometric data is not uniformly distributed, as inter-octave frequencies are less commonly tested. With 3–8 missing features per instance, a real-world sparsity distribution was associated with significantly better performance compared to other sparsity distributions (Δ RMSE 0.3–5.8 dB, non-overlapping 99% confidence intervals). With a real-world sparsity distribution, models were able to safely impute up to 6 missing datapoints in an 11-frequency audiogram. MICE consistently outperformed other models across all metrics and sparsity distributions (p < 0.01, Wilcoxon rank sum test). With sparsity capped at 6 missing features per audiogram but otherwise equivalent to the raw dataset, MICE imputed with RMSE of 7.83 dB [95% CI 7.81–7.86]. Imputing up to 6 missing features captures 99.3% of the audiograms in our dataset, allowing for a 5.7-fold increase in dataset size (1,304 to 7,399 audiograms) as compared with complete case analysis.

Conclusion: Precision medicine will inevitably play an integral role in the future of hearing healthcare. These methods are data dependent, and rigorously validated imputation models are a key tool for maximizing datasets. Using the largest CI audiogram dataset to date, we demonstrate that in a real-world scenario MICE can safely impute missing data for the vast majority (>99%) of audiograms with RMSE well below a clinically significant threshold of 10 dB. Evaluation across a range of dataset sizes and sparsity distributions suggests a high degree of generalizability to future applications.
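The evaluation idea described in the abstract (hide known thresholds, impute them, then score RMSE on the hidden entries against the 10 dB safety threshold) can be illustrated with a minimal sketch. This is not the authors' pipeline: it uses plain linear interpolation, one of the simpler model families the abstract lists, rather than MICE, and it omits nested cross-validation. The 11 frequencies, the example audiogram, and all function names are assumptions made for illustration.

```python
import math
import random

# Assumed 11 test frequencies (Hz); the abstract does not list them.
FREQS_HZ = [125, 250, 500, 750, 1000, 1500, 2000, 3000, 4000, 6000, 8000]
SAFE_RMSE_DB = 10.0  # safety threshold stated in the abstract


def mask_random(audiogram, n_missing, rng):
    """Return a copy with n_missing randomly chosen thresholds set to None."""
    hidden = set(rng.sample(range(len(audiogram)), n_missing))
    return [None if i in hidden else v for i, v in enumerate(audiogram)]


def interpolate(sparse):
    """Fill None entries by linear interpolation over log2(frequency).

    Gaps at either edge are filled with the nearest observed value.
    """
    x = [math.log2(f) for f in FREQS_HZ]
    known = [i for i, v in enumerate(sparse) if v is not None]
    if not known:
        raise ValueError("at least one observed threshold is required")
    filled = list(sparse)
    for i, v in enumerate(sparse):
        if v is not None:
            continue
        left = max((k for k in known if k < i), default=None)
        right = min((k for k in known if k > i), default=None)
        if left is None:
            filled[i] = sparse[right]
        elif right is None:
            filled[i] = sparse[left]
        else:
            w = (x[i] - x[left]) / (x[right] - x[left])
            filled[i] = sparse[left] + w * (sparse[right] - sparse[left])
    return filled


def masked_rmse(truth, sparse, imputed):
    """RMSE in dB over only the entries that were hidden (None in sparse)."""
    sq = [(t - p) ** 2 for t, s, p in zip(truth, sparse, imputed) if s is None]
    return math.sqrt(sum(sq) / len(sq))


# Synthetic sloping hearing loss in dB HL (made-up values, not study data).
truth = [40, 45, 55, 60, 70, 75, 85, 90, 95, 100, 105]
sparse = mask_random(truth, 6, random.Random(0))
imputed = interpolate(sparse)
rmse = masked_rmse(truth, sparse, imputed)
print(f"RMSE = {rmse:.2f} dB, below safety threshold: {rmse < SAFE_RMSE_DB}")
```

Scoring only the hidden entries matters: including observed thresholds, which are reproduced exactly, would understate the error. The dataset figures in the abstract are also easy to check by hand: 7,399 / 1,304 ≈ 5.7 (the fold increase) and 7,399 / 7,451 ≈ 99.3% (the share of audiograms retained).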

Publisher

Public Library of Science (PLoS)

Subject

Multidisciplinary

References: 28 articles.

1. Cochlear implantation outcomes in adults: A scoping review;I Boisvert;PLoS One,2020

2. Prediction models for clinical outcome after cochlear implantation: a systematic review;HM Velde;J Clin Epidemiol,2021

3. Missing data is poorly handled and reported in prediction model studies using machine learning: a literature review;S Nijman;J Clin Epidemiol,2022

4. Review: a gentle introduction to imputation of missing values;AR Donders;J Clin Epidemiol,2006

Cited by 4 articles.

1. Artificial Intelligence in Otology and Neurotology;Otolaryngologic Clinics of North America;2024-10

2. Individual Patient Comorbidities and Effect on Cochlear Implant Performance;Otology & Neurotology;2024-02-29

3. Artificial Intelligence for Cochlear Implants: Review of Strategies, Challenges, and Perspectives;IEEE Access;2024

4. Current big data approaches to clinical questions in otolaryngology;Big Data in Otolaryngology;2024