Deep Learning-Based Detection of Glottis Segmentation Failures

Published: 2024-04-30
Volume: 11
Issue: 5
Page: 443
ISSN: 2306-5354
Container-title: Bioengineering
Language: en
Short-container-title: Bioengineering

Author:

Dadras Armin A.¹, Aichinger Philipp¹

Affiliation:

1. Speech and Hearing Science Lab, Division of Phoniatrics-Logopedics, Department of Otorhinolaryngology, Medical University of Vienna, Währinger Gürtel 18-20, 1090 Vienna, Austria

Abstract

Medical image segmentation is crucial for clinical applications, but challenges persist due to noise and variability. In particular, accurate glottis segmentation from high-speed videos is vital for voice research and diagnostics. Manual searching for failed segmentations is labor-intensive, prompting interest in automated methods. This paper proposes the first deep learning approach for detecting faulty glottis segmentations. For this purpose, faulty segmentations are generated by applying both a poorly performing neural network and perturbation procedures to three public datasets. Heavy data augmentations are added to the input until the neural network’s performance decreases to the desired mean intersection over union (IoU). Likewise, the perturbation procedure involves a series of image transformations to the original ground truth segmentations in a randomized manner. These data are then used to train a ResNet18 neural network with custom loss functions to predict the IoU scores of faulty segmentations. This value is then thresholded with a fixed IoU of 0.6 for classification, thereby achieving 88.27% classification accuracy with 91.54% specificity. Experimental results demonstrate the effectiveness of the presented approach. Contributions include: (i) a knowledge-driven perturbation procedure, (ii) a deep learning framework for scoring and detecting faulty glottis segmentations, and (iii) an evaluation of custom loss functions.
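The scoring and thresholding step described in the abstract can be illustrated with a minimal sketch. It computes the IoU between a candidate segmentation and a ground-truth mask (the quantity the ResNet18 is trained to predict) and flags the segmentation as faulty when the score falls below 0.6. The function names, the nested-list mask representation, the strict `<` comparison, and the handling of two empty masks are assumptions made for illustration; they are not taken from the paper.

```python
from typing import Sequence

# A binary mask as rows of 0/1 pixels (illustrative representation only).
Mask = Sequence[Sequence[int]]

# Fixed IoU threshold reported in the abstract.
IOU_THRESHOLD = 0.6


def iou(pred: Mask, truth: Mask) -> float:
    """Intersection over union of two binary masks of the same shape."""
    intersection = 0
    union = 0
    for pred_row, truth_row in zip(pred, truth):
        for p, t in zip(pred_row, truth_row):
            if p and t:
                intersection += 1
            if p or t:
                union += 1
    if union == 0:
        # Assumption: two empty masks (e.g. a fully closed glottis)
        # count as perfect agreement.
        return 1.0
    return intersection / union


def is_faulty(score: float, threshold: float = IOU_THRESHOLD) -> bool:
    """Classify a segmentation as faulty when its IoU is below the threshold.

    Assumption: scores strictly below the threshold are treated as faulty.
    """
    return score < threshold


# Toy 4x4 example: a 2x2 ground-truth region and two candidate masks.
truth = [
    [0, 0, 0, 0],
    [0, 1, 1, 0],
    [0, 1, 1, 0],
    [0, 0, 0, 0],
]
pred_ok = [  # covers the truth plus two extra pixels: IoU = 4/6
    [0, 0, 0, 0],
    [0, 1, 1, 1],
    [0, 1, 1, 1],
    [0, 0, 0, 0],
]
pred_bad = [  # shifted diagonally, one pixel of overlap: IoU = 1/7
    [0, 0, 0, 0],
    [0, 0, 0, 0],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]

for name, pred in (("pred_ok", pred_ok), ("pred_bad", pred_bad)):
    score = iou(pred, truth)
    print(f"{name}: IoU={score:.3f}, faulty={is_faulty(score)}")
```

In the approach described above, ground truth is not available at inference time, so the score passed to the threshold would be the network's predicted IoU rather than one computed against a reference mask as in this sketch.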

Funder

Austrian Science Fund

Publisher

MDPI AG

Link

https://www.mdpi.com/2306-5354/11/5/443/pdf
