Affiliation:
1. Department of Psychological Sciences, Texas Tech University
Abstract
Interrater reliability (IRR) assesses the stability of a coding protocol over time and across coders. For practical reasons, it is often difficult to assess IRR for an entire dataset, so researchers sometimes calculate IRR for a subset of the total data sample. The purpose of this study is to investigate the accuracy of such subset IRRs. Using bootstrapping, we determined the effects of sample size (10%, 25%, & 40% of the total dataset) and IRR measure type (percent agreement, Krippendorff’s alpha, & the G Index) on the bias and percent error of subset IRRs. Results support calculating IRR from subsets of the total data sample, though we discuss how the accuracy of subset IRR values may depend on aspects of the dataset such as total sample size and coding methodology.
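The procedure described in the abstract can be illustrated with a minimal sketch: compute an agreement measure on the full set of doubly coded items, then repeatedly draw random subsets and compare the subset estimates with the full-sample value. This is not the authors' code; the simulated two-coder nominal data, the agreement level, and the number of bootstrap draws are illustrative assumptions, while the subset fractions (10%, 25%, 40%) and the three IRR measures follow the abstract. Krippendorff's alpha is implemented here only for the simple case of two coders, nominal codes, and no missing data.

```python
import random

def percent_agreement(c1, c2):
    """Proportion of items on which the two coders assign the same code."""
    return sum(a == b for a, b in zip(c1, c2)) / len(c1)

def g_index(c1, c2, n_categories):
    """Holley-Guilford G index: chance-corrected agreement assuming
    equally likely categories, G = (k*Po - 1) / (k - 1)."""
    po = percent_agreement(c1, c2)
    return (n_categories * po - 1) / (n_categories - 1)

def krippendorff_alpha_nominal(c1, c2):
    """Krippendorff's alpha for two coders, nominal data, no missing values."""
    cats = sorted(set(c1) | set(c2))
    o = {c: {k: 0.0 for k in cats} for c in cats}   # coincidence matrix
    for a, b in zip(c1, c2):
        o[a][b] += 1
        o[b][a] += 1
    n_c = {c: sum(o[c].values()) for c in cats}
    n = sum(n_c.values())
    d_o = sum(o[c][k] for c in cats for k in cats if c != k) / n
    d_e = sum(n_c[c] * n_c[k] for c in cats for k in cats if c != k) / (n * (n - 1))
    return 1.0 if d_e == 0 else 1 - d_o / d_e

# --- illustrative simulated coding data (hypothetical, not from the study) ---
random.seed(1)
N, K = 400, 3                                   # coded items, categories
coder1 = [random.randrange(K) for _ in range(N)]
coder2 = [c if random.random() < 0.85 else random.randrange(K) for c in coder1]

full_alpha = krippendorff_alpha_nominal(coder1, coder2)

for fraction in (0.10, 0.25, 0.40):             # subset sizes from the abstract
    m = int(N * fraction)
    estimates = []
    for _ in range(1000):                       # bootstrap draws (assumed count)
        idx = random.sample(range(N), m)        # random subset of items
        estimates.append(krippendorff_alpha_nominal(
            [coder1[i] for i in idx], [coder2[i] for i in idx]))
    bias = sum(estimates) / len(estimates) - full_alpha
    pct_error = sum(abs(e - full_alpha) for e in estimates) / len(estimates) / full_alpha * 100
    print(f"{fraction:.0%} subset: bias = {bias:+.4f}, mean % error = {pct_error:.1f}%")
```

The same loop can be run with percent_agreement or g_index in place of krippendorff_alpha_nominal to compare how each measure behaves as the subset shrinks.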
Subject
General Medicine, General Chemistry