Affiliation:
1. Center for Innovation to Implementation, VA Palo Alto Health Care System, Menlo Park, California
2. Stanford‐Surgery Policy Improvement Research and Education Center, Department of Surgery, Stanford University, Stanford, California
Abstract
Quality measurement plays an increasing role in U.S. health care. Measures inform quality improvement efforts, public reporting of variations in quality of care across providers and hospitals, and high‐stakes financial decisions. To be meaningful in these contexts, measures should be reliable, that is, not heavily influenced by chance variation in sampling or measurement. Measure developers and endorsers use several different methods to evaluate reliability in practice; however, there is uncertainty and debate over how these methods differ and how their results should be interpreted. We review the methods currently in use, pointing out differences that can lead to disparate reliability estimates. We compare estimates from 14 different methods using two sets of mental health quality measures within a large health system. We find that estimates can differ substantially and that these discrepancies widen when sample size is reduced.
Funder
U.S. Department of Veterans Affairs