Author:
Binol Hamidullah, Moberly Aaron C., Niazi M. Khalid Khan, Essig Garth, Shah Jay, Elmaraghy Charles, Teknos Theodoros, Taj-Schaal Nazhat, Yu Lianbo, Gurcan Metin N.
Abstract
Background and Objective: The aim of this study is to develop and validate an automated, image segmentation-based frame selection and stitching framework that creates enhanced composite images from otoscope videos. The proposed framework, called SelectStitch, is useful for classifying eardrum abnormalities from a single composite image instead of the entire raw otoscope video dataset.

Methods: SelectStitch consists of a convolutional neural network (CNN) based semantic segmentation approach that detects the eardrum in each frame of the otoscope video, and a stitching engine that generates a high-quality composite image from the detected eardrum regions. In this study, we utilize two separate datasets: the first contains 36 otoscope videos that were used to train the semantic segmentation model, and the second contains 100 videos that were used to test the proposed method. Cases from both adult and pediatric patients were used. A U-Net architecture with a depth of four levels was trained on the first dataset to automatically find eardrum regions in each otoscope video frame. After segmentation, we automatically selected meaningful frames from the otoscope videos using a pre-defined threshold: a frame is kept only if its eardrum region covers at least 20% of the frame size. We generated 100 composite images from the test dataset. In two rounds, three ear, nose, and throat (ENT) specialists (ENT-I, ENT-II, ENT-III) compared the diagnostic capabilities of the composite images produced by SelectStitch against those of the composite images generated by the base process, i.e., stitching all the frames from the same video.

Results: In the first round of the study, ENT-I, ENT-II, and ENT-III graded 58, 57, and 71 of the 100 SelectStitch composite images, respectively, as improved over the base composite, reflecting greater diagnostic capabilities. In the repeat assessment, these numbers were 56, 56, and 64, respectively. Only 6%, 3%, and 3% of the cases received a lower score than the base composite images from ENT-I, ENT-II, and ENT-III, respectively, in Round-1, and 4%, 0%, and 2% of the cases in Round-2.

Conclusions: Frame selection improves the diagnostic quality of composite images from otoscope video clips.
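The frame-selection rule described in the Methods (keep a frame only if the segmented eardrum covers at least 20% of the frame) can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `select_frames`, the `min_fraction` parameter, and the representation of each segmentation result as a boolean NumPy mask are assumptions made for illustration, and the toy masks stand in for the U-Net output.

```python
import numpy as np

def select_frames(masks, min_fraction=0.20):
    """Return indices of frames whose eardrum mask covers at least
    `min_fraction` of the frame area.

    `masks` is assumed to be an iterable of 2-D arrays in which nonzero
    pixels mark the segmented eardrum (hypothetical format).
    """
    selected = []
    for index, mask in enumerate(masks):
        mask = np.asarray(mask, dtype=bool)
        # Mean of a boolean mask = fraction of pixels labeled as eardrum.
        coverage = mask.mean()
        if coverage >= min_fraction:
            selected.append(index)
    return selected

# Toy masks standing in for per-frame segmentation results.
empty = np.zeros((10, 10), dtype=bool)   # 0% eardrum coverage
small = empty.copy()
small[:1, :] = True                      # 10% coverage
medium = empty.copy()
medium[:3, :] = True                     # 30% coverage
large = empty.copy()
large[:5, :] = True                      # 50% coverage

kept = select_frames([empty, small, medium, large])
print(kept)
```

Only the frames returned by such a filter would then be passed to the stitching step; the threshold is exposed as a parameter here so that values other than 20% can be tried.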
Publisher
Cold Spring Harbor Laboratory