Authors:
Burn CC, Pritchard JC, Whay HR
Abstract
Welfare issues relevant to equids working in developing countries may differ greatly from those of sport and companion equids in developed countries. In this study, we test the observer reliability of a working equine welfare assessment, demonstrating how the prevalence of certain observations reduces reliability ratings. The assessment included behaviour, general health, wounds, and limb and foot pathologies. In Study 1, agreement between five observers and their trainer (the ‘gold standard’) was assessed using 80 horses and 80 donkeys in India. Intra-observer agreement was later tested on 40 of each species. Study 2 took place in Egypt, using nine observers, their trainer, 30 horses and 30 donkeys, adjusting some scoring systems and providing observers with more detailed guidelines than in Study 1. Percentage agreements, Fleiss’ kappa (with a weighted version for ordinal scores) and prevalence indices were calculated for each variable. Reliability was similar across both studies, but was significantly poorer for donkeys than for horses. Age, sex, certain wounds and (for horses alone) body condition consistently attained clinically useful reliability. Hoof horn quality, point-of-hock lesions, mucous membrane abnormalities, limb-tether lesions and skin tenting showed poor reliability. Reporting the prevalence index alongside the percentage agreement showed that, for many variables, the populations were too homogeneous for conclusive reliability ratings. Suggestions are made for improving scoring systems that showed poor reliability, but future testing will require deliberate selection of a more diverse equine population. This could prove challenging given that, in both the horse and donkey populations studied here, many pathologies apparently showed 90-100% prevalence.
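The measures named in the abstract are standard and straightforward to reproduce. Below is a minimal Python sketch (not the authors' code) of Fleiss' kappa for a nominal score and of a two-category prevalence index; it assumes a fixed number of raters per animal and a binary score, and omits the weighted kappa the study used for ordinal scores. The toy data are invented for illustration only.

import numpy as np

def fleiss_kappa(counts):
    """Fleiss' kappa for a subjects-by-categories count matrix.

    counts[i, j] = number of raters assigning subject i to category j;
    every row is assumed to sum to the same number of raters n.
    (Sketch only: the degenerate case where chance agreement is 1
    is not handled.)
    """
    counts = np.asarray(counts, dtype=float)
    N = counts.shape[0]
    n = counts[0].sum()                            # raters per subject
    p_j = counts.sum(axis=0) / (N * n)             # category proportions
    P_i = (np.square(counts).sum(axis=1) - n) / (n * (n - 1))
    P_bar = P_i.mean()                             # observed agreement
    P_e = np.square(p_j).sum()                     # chance agreement
    return (P_bar - P_e) / (1 - P_e)

def prevalence_index(counts):
    """|p1 - p2| for a binary score: approaches 1 when nearly all
    animals fall into one category, which is exactly the situation
    that makes kappa hard to interpret."""
    counts = np.asarray(counts, dtype=float)
    p = counts.sum(axis=0) / counts.sum()
    return abs(p[0] - p[1])

# Invented toy data: 5 animals, 3 observers, binary score
# (column 0 = lesion absent, column 1 = lesion present).
scores = [[3, 0], [2, 1], [3, 0], [3, 0], [3, 0]]
print(fleiss_kappa(scores))       # approx -0.07
print(prevalence_index(scores))   # approx 0.87

With these skewed toy data, raw agreement is about 87% yet kappa is slightly negative, while the prevalence index is near 1. This illustrates the point the abstract makes: in a highly homogeneous population, chance-corrected agreement statistics can look poor (or be inconclusive) even when observers almost always agree.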
Publisher
Cambridge University Press (CUP)
Subject
General Veterinary; General Biochemistry, Genetics and Molecular Biology; Animal Science and Zoology