Abstract
Background: Multiple choice questions (MCQs) are commonly used in medical student assessments but are often prepared by clinicians without formal qualifications in education. This study aimed to inform the question writing process by investigating the association between MCQ characteristics and commonly used statistical measures of individual item quality for a paediatric medical term. Methods: Item characteristics and statistics for five consecutive annual barrier paediatric medical student assessments (each n=60 items) were examined retrospectively. Items were characterised according to format (single best answer vs. extended matching); stem and option length; presence of a vignette and whether it was required to answer the question; inclusion of images or tables; clinical skill assessed; paediatric speciality; clinical relevance/applicability; Bloom’s taxonomy domain; and item flaws. For each item, we recorded the facility (proportion of students answering correctly) and point biserial (discrimination). Results: Item characteristics significantly and positively correlated (p<0.05) with facility were the presence of a relevant vignette, diagnosis or application items, longer stems and higher clinical relevance. Recall items (e.g., epidemiology items) were associated with lower facility. Characteristics significantly correlated with higher discrimination were extended matching question (EMQ) format, longer options, and diagnostic and subspeciality items. Variation in item characteristics explained little of the variation in facility or point biserial (less than 10%). Conclusions: Our research supports the use of longer items, relevant vignettes, clinically relevant content, EMQs and diagnostic items for optimising paediatric MCQ assessment quality. Variation in item characteristics explains only a small amount of the observed variation in statistical measures of MCQ quality, highlighting the importance of clinical expertise in writing high-quality assessments.
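For readers unfamiliar with the two item statistics named in the Methods, the following is a minimal sketch of how they are conventionally computed. It is illustrative only and is not the study's analysis code. It assumes dichotomous (0/1) item scoring and an uncorrected point biserial, that is, a correlation with the total score including the item itself; the abstract does not state which variant was used. The function names and example data are hypothetical.

```python
from math import sqrt


def facility(item_scores):
    """Proportion of students answering the item correctly (scores are 0 or 1)."""
    return sum(item_scores) / len(item_scores)


def point_biserial(item_scores, total_scores):
    """Point-biserial correlation between a 0/1 item score and the total score.

    Assumption: uncorrected form (the total score includes the item itself),
    using the population standard deviation of the total scores.
    """
    n = len(item_scores)
    n_correct = sum(item_scores)
    if n_correct == 0 or n_correct == n:
        raise ValueError("undefined when all students score the same on the item")

    mean_total = sum(total_scores) / n
    sd_total = sqrt(sum((t - mean_total) ** 2 for t in total_scores) / n)
    if sd_total == 0:
        raise ValueError("undefined when all total scores are identical")

    # Mean total score of students who got the item right vs. wrong.
    mean_correct = sum(
        t for s, t in zip(item_scores, total_scores) if s == 1
    ) / n_correct
    mean_incorrect = sum(
        t for s, t in zip(item_scores, total_scores) if s == 0
    ) / (n - n_correct)

    p = n_correct / n
    return (mean_correct - mean_incorrect) / sd_total * sqrt(p * (1 - p))


if __name__ == "__main__":
    # Hypothetical data: five students, one item, and their total exam scores.
    item = [1, 1, 1, 0, 0]
    totals = [50, 45, 40, 30, 20]
    print("facility:", facility(item))
    print("point biserial:", round(point_biserial(item, totals), 3))
```

A higher facility indicates an easier item, while a higher point biserial indicates that students who answered the item correctly tended to score higher on the assessment overall.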