Mask R-CNN based multiclass segmentation model for endotracheal intubation using video laryngoscope

Author:

Choi Seung Jae (1), Kim Dae Kon (2,3,4), Kim Byeong Soo (5), Cho Minwoo (1), Jeong Joo (2,3), Jo You Hwan (2,3), Song Kyoung Jun (3,6), Kim Yu Jin (2,3), Kim Sungwan (4,7)

Affiliation:

1. Transdisciplinary Department of Medicine and Advanced Technology, Seoul National University Hospital, Seoul, Republic of Korea

2. Department of Emergency Medicine, Seoul National University Bundang Hospital, Seongnam, Republic of Korea

3. Department of Emergency Medicine, Seoul National University College of Medicine, Seoul, Republic of Korea

4. Department of Biomedical Engineering, Seoul National University College of Medicine, Seoul, Republic of Korea

5. Interdisciplinary Program in Bioengineering, Graduate School, Seoul National University, Seoul, Republic of Korea

6. Department of Emergency Medicine, Seoul Metropolitan Government-Seoul National University Boramae Medical Center, Seoul, Republic of Korea

7. Institute of Bioengineering, Seoul National University, Seoul, Republic of Korea

Abstract

Objective: Endotracheal intubation (ETI) is critical for securing the airway in emergent situations. Although artificial intelligence algorithms are frequently used to analyze medical images, their application to evaluating intraoral structures in images captured during emergent ETI remains limited. The aim of this study is to develop an artificial intelligence model that segments structures in the oral cavity using video laryngoscope (VL) images.

Methods: From 54 VL videos, clinicians manually labeled images, including frames with motion blur, foggy vision, blood, mucus, and vomitus. The anatomical structures of interest were the tongue, epiglottis, vocal cord, and corniculate cartilage. Three models were used: EfficientNet-B5 with DeepLabv3+, EfficientNet-B5 with U-Net, and a Configured Mask R-CNN (region-based convolutional neural network); EfficientNet-B5 was pretrained on ImageNet. The Dice similarity coefficient (DSC) was used to measure the segmentation performance of each model. Accuracy, recall, specificity, and F1 score were used to evaluate how well each model targeted a structure, based on the intersection over union (IoU) between the ground truth and the predicted mask.

Results: The DSCs for the tongue, epiglottis, vocal cord, and corniculate cartilage were 0.3351/0.7675/0.766/0.6539 for EfficientNet-B5 with DeepLabv3+, 0.0/0.7581/0.7395/0.6906 for EfficientNet-B5 with U-Net, and 0.1167/0.7677/0.7207/0.57 for the Configured Mask R-CNN. The processing speeds of the three models were 3, 24, and 32 frames per second, respectively.

Conclusions: The algorithm developed in this study can assist medical providers performing ETI in emergent situations.
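The evaluation measures named in the Methods can be illustrated with a minimal sketch. It is not the authors' code. It assumes binary masks stored as nested lists of 0/1, an IoU threshold of 0.5 (the abstract does not state one), and a simple per-image decision rule; the function names `dice`, `iou`, and `detection_metrics` are hypothetical.

```python
from typing import Dict, Iterable, Sequence, Tuple

Mask = Sequence[Sequence[int]]  # binary mask: 1 = structure pixel, 0 = background


def _overlap_counts(truth: Mask, pred: Mask) -> Tuple[int, int, int]:
    """Return (intersection, truth area, prediction area) in pixels."""
    inter = truth_area = pred_area = 0
    for t_row, p_row in zip(truth, pred):
        for t, p in zip(t_row, p_row):
            truth_area += t
            pred_area += p
            inter += t & p
    return inter, truth_area, pred_area


def dice(truth: Mask, pred: Mask) -> float:
    """Dice similarity coefficient: 2|A ∩ B| / (|A| + |B|)."""
    inter, truth_area, pred_area = _overlap_counts(truth, pred)
    total = truth_area + pred_area
    # Assumption: two empty masks count as perfect agreement.
    return 2 * inter / total if total else 1.0


def iou(truth: Mask, pred: Mask) -> float:
    """Intersection over union: |A ∩ B| / |A ∪ B|."""
    inter, truth_area, pred_area = _overlap_counts(truth, pred)
    union = truth_area + pred_area - inter
    return inter / union if union else 1.0


def detection_metrics(
    pairs: Iterable[Tuple[Mask, Mask]], iou_threshold: float = 0.5
) -> Dict[str, float]:
    """Per-image detection metrics for one structure class.

    Assumed rule (not from the paper): if the structure is present in the
    ground truth, the image is a true positive when IoU >= threshold and a
    false negative otherwise; if it is absent, any predicted pixel makes
    the image a false positive, and an empty prediction a true negative.
    """
    tp = tn = fp = fn = 0
    for truth, pred in pairs:
        _, truth_area, pred_area = _overlap_counts(truth, pred)
        if truth_area > 0:
            if iou(truth, pred) >= iou_threshold:
                tp += 1
            else:
                fn += 1
        elif pred_area > 0:
            fp += 1
        else:
            tn += 1
    n = tp + tn + fp + fn
    return {
        "accuracy": (tp + tn) / n if n else 0.0,
        "recall": tp / (tp + fn) if tp + fn else 0.0,
        "specificity": tn / (tn + fp) if tn + fp else 0.0,
        "f1": 2 * tp / (2 * tp + fp + fn) if 2 * tp + fp + fn else 0.0,
    }
```

In this sketch, `dice` scores pixel-level overlap for a single image, while `detection_metrics` first turns each image into a hit or a miss through the IoU threshold and then summarizes those decisions, which mirrors the two kinds of measure described in the abstract.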

Funder

National Research Foundation of Korea

AI Institute at Seoul National University

Publisher

SAGE Publications

Subject

Health Information Management, Computer Science Applications, Health Informatics, Health Policy

Cited by 4 articles.