Inter-rater reliability of risk of bias tools for non-randomized studies

Author:

Kalaycioglu Isabel, Rioux Bastien, Briard Joel Neves, Nehme Ahmad, Touma Lahoud, Dansereau Bénédicte, Veilleux-Carpentier Ariane, Keezer Mark R.

Abstract

Purpose: There is limited knowledge on the reliability of risk of bias (ROB) tools for assessing internal validity in systematic reviews of exposure and frequency studies. We aimed to identify and then compare the inter-rater reliability (IRR) of six commonly used tools for frequency (Loney scale, Gyorkos checklist, American Academy of Neurology [AAN] tool) and exposure (Newcastle–Ottawa scale, SIGN50 checklist, AAN tool) studies.

Methods: Six raters independently assessed the ROB of 30 frequency and 30 exposure studies using the three respective ROB tools. Articles were rated as low, intermediate, or high ROB. We calculated an intraclass correlation coefficient (ICC) for each tool and for each category of ROB tool. We compared the IRR between ROB tools and between tool categories by inspecting whether the 95% CIs of the ICCs overlapped and by comparing the coefficients after transformation to Fisher's Z values. We assessed the criterion validity of the AAN ROB tools by calculating an ICC for each rater in comparison with the original ratings from the AAN.

Results: All individual ROB tools had an IRR in the substantial range (ICC point estimates between 0.61 and 0.80) or higher. The IRR was almost perfect (ICC point estimate > 0.80) for the AAN frequency tool and the SIGN50 checklist. All tools were comparable in IRR, except for the AAN frequency tool, which had a significantly higher ICC than the Gyorkos checklist (p = 0.021) and trended towards a higher ICC than the Loney scale (p = 0.085). When examined by category of ROB tool, scales and checklists had a substantial IRR, whereas the AAN tools had an almost perfect IRR. For the criterion validity of the AAN ROB tools, the average agreement between our raters and the original AAN ratings was moderate.

Conclusion: All tools had substantial IRRs except for the AAN frequency tool and the SIGN50 checklist, which both had an almost perfect IRR. The AAN ROB tools were the only category of ROB tools to demonstrate an almost perfect IRR. This category of ROB tools had fewer and simpler criteria. Overall, parsimonious tools with clear instructions, such as those from the AAN, may provide more reliable ROB assessments.
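The ICC calculation described in the Methods can be illustrated with a minimal sketch. The abstract does not state which ICC form the authors used, so the two-way random-effects, absolute-agreement, single-rater form, ICC(2,1), is an assumption here, as is the numeric coding of ratings (low = 1, intermediate = 2, high = 3). The function names `icc2_1` and `agreement_label` are hypothetical, and the ratings matrix is toy data, not data from the study. The labels follow the cut-offs quoted in the abstract (substantial: 0.61 to 0.80; almost perfect: > 0.80); the 0.41 to 0.60 band for "moderate" is taken from the usual Landis and Koch convention and is likewise an assumption.

```python
import numpy as np


def icc2_1(ratings):
    """ICC(2,1): two-way random effects, absolute agreement, single rater.

    ratings: 2-D array-like, rows = articles, columns = raters.
    Assumes a complete matrix (every rater scored every article).
    """
    x = np.asarray(ratings, dtype=float)
    n, k = x.shape                       # n articles, k raters
    grand = x.mean()
    row_means = x.mean(axis=1)           # per-article means
    col_means = x.mean(axis=0)           # per-rater means

    # Two-way ANOVA sums of squares.
    ss_rows = k * ((row_means - grand) ** 2).sum()
    ss_cols = n * ((col_means - grand) ** 2).sum()
    ss_total = ((x - grand) ** 2).sum()
    ss_err = ss_total - ss_rows - ss_cols

    msr = ss_rows / (n - 1)              # between-article mean square
    msc = ss_cols / (k - 1)              # between-rater mean square
    mse = ss_err / ((n - 1) * (k - 1))   # residual mean square

    return (msr - mse) / (msr + (k - 1) * mse + k * (msc - mse) / n)


def agreement_label(icc):
    """Map an ICC point estimate to a qualitative agreement label."""
    if icc > 0.80:
        return "almost perfect"
    if icc > 0.60:
        return "substantial"
    if icc > 0.40:
        return "moderate"                # assumed Landis and Koch band
    return "below moderate"


if __name__ == "__main__":
    # Toy data: 5 articles rated by 3 raters (1 = low, 2 = intermediate, 3 = high ROB).
    toy = [
        [1, 1, 1],
        [2, 2, 3],
        [3, 3, 3],
        [1, 2, 1],
        [2, 2, 2],
    ]
    value = icc2_1(toy)
    print(round(value, 3), agreement_label(value))
```

In practice a statistics package that also reports the 95% CI would be preferable, since the study compares tools partly by inspecting whether those intervals overlap.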

Publisher

Springer Science and Business Media LLC

Subject

Medicine (miscellaneous)

Cited by 3 articles.
