Scoring sleep with artificial intelligence enables quantification of sleep stage ambiguity: hypnodensity based on multiple expert scorers and auto-scoring

Author:

Bakker Jessie P (1), Ross Marco (2), Cerny Andreas (2), Vasko Ray (1), Shaw Edmund (1), Kuna Samuel (3, 4), Magalang Ulysses J (5), Punjabi Naresh M (6), Anderer Peter (2)

Affiliation:

1. Philips Sleep and Respiratory Care, Pittsburgh, PA, USA

2. Philips Sleep and Respiratory Care, Vienna, Austria

3. Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, USA

4. Corporal Michael J. Crescenz Veterans Affairs Medical Center, Philadelphia, PA, USA

5. Division of Pulmonary, Critical Care, and Sleep Medicine, Ohio State University Wexner Medical Center, Columbus, OH, USA

6. Division of Pulmonary, Critical Care, and Sleep Medicine, University of Miami, Miami, FL, USA

Abstract

Study Objectives: To quantify the amount of sleep stage ambiguity across expert scorers and to validate a new auto-scoring platform against sleep staging performed by multiple scorers.

Methods: We applied a new auto-scoring system to three datasets containing 95 PSGs scored by 6–12 scorers. The primary output compared was the sleep stage probabilities (hypnodensity, i.e. the probability of each sleep stage being assigned to a given epoch); we also compared a single sleep stage per epoch assigned by hierarchical majority rule.

Results: The percentage of epochs with 100% agreement across scorers was 46 ± 9%, 38 ± 10%, and 32 ± 9% for the datasets with 6, 9, and 12 scorers, respectively. The mean intra-class correlation coefficient between sleep stage probabilities from auto- and manual-scoring was 0.91, representing excellent reliability. Within each dataset, agreement between auto-scoring and consensus manual-scoring was significantly higher than agreement between manual-scoring and consensus manual-scoring (0.78 vs. 0.69; 0.74 vs. 0.67; and 0.75 vs. 0.67; all p < 0.01).

Conclusions: Analysis of scoring performed by multiple scorers reveals that sleep stage ambiguity is the rule rather than the exception. Probabilities of the sleep stages determined by artificial intelligence auto-scoring provide an excellent estimate of this ambiguity. When compared against consensus manual-scoring, sleep staging derived from auto-scoring is noninferior to manual-scoring for each individual PSG, meaning that auto-scoring output is ready for interpretation without the need for manual adjustment.
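
To make the notion of hypnodensity from multiple scorers concrete, the following is a minimal sketch, not the authors' implementation. It assumes each scorer's hypnogram is a list of stage labels of equal length, computes the per-epoch stage probabilities as the fraction of scorers choosing each stage, and reports the share of epochs with 100% agreement. The abstract does not specify the hierarchy used in the majority rule, so the tie-break order `TIE_BREAK` and all function names here are hypothetical.

```python
from collections import Counter

STAGES = ["W", "N1", "N2", "N3", "R"]

# Hypothetical tie-break order; the source does not state the actual hierarchy.
TIE_BREAK = ["W", "N1", "N2", "N3", "R"]


def hypnodensity(scorings):
    """Per-epoch stage probabilities from several scorers' hypnograms.

    scorings: list of equal-length lists of stage labels, one list per scorer.
    Returns one dict (stage -> fraction of scorers) per epoch.
    """
    n_scorers = len(scorings)
    density = []
    for epoch_labels in zip(*scorings):
        counts = Counter(epoch_labels)
        density.append({s: counts.get(s, 0) / n_scorers for s in STAGES})
    return density


def unanimous_fraction(density):
    """Fraction of epochs in which all scorers assigned the same stage."""
    unanimous = sum(1 for probs in density if max(probs.values()) == 1.0)
    return unanimous / len(density)


def majority_stage(probs, tie_break=TIE_BREAK):
    """Most probable stage; ties resolved by the (assumed) tie_break order."""
    best = max(probs.values())
    for stage in tie_break:
        if probs[stage] == best:
            return stage


# Toy example with three scorers and four epochs (made-up labels).
scorer_a = ["W", "N1", "N2", "R"]
scorer_b = ["W", "N2", "N2", "R"]
scorer_c = ["W", "N1", "N3", "N1"]

density = hypnodensity([scorer_a, scorer_b, scorer_c])
consensus = [majority_stage(p) for p in density]
```

In this toy example only the first epoch is unanimous, so `unanimous_fraction(density)` is 0.25, and the remaining epochs carry split probabilities such as two thirds N1 and one third N2.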

Publisher

Oxford University Press (OUP)

Subject

Physiology (medical),Neurology (clinical)
