Abstract
Background
Item difficulty plays a crucial role in assessing students’ understanding of the concept being tested. The difficulty of each item needs to be carefully adjusted to ensure the achievement of the evaluation’s objectives. Therefore, this study aimed to investigate whether repeated item development training for medical school faculty improves the accuracy of predicting item difficulty in multiple-choice questions.
Methods
A faculty development program was implemented to enhance the prediction of each item’s difficulty index, ensure the absence of item defects, and maintain the general principles of item development. Agreement among the predicted, actual, and corrected item difficulty was assessed before and after the training, using either the kappa index or the correlation coefficient, depending on the characteristics of the data. A total of 62 faculty members participated in the training. Their predictions of item difficulty were compared with the analysis results of 260 items taken by 119 fourth-year medical students in 2016 and 316 items taken by 125 fourth-year medical students in 2018.
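The quantities named above can be illustrated with a minimal Python sketch. It is not the authors' analysis code. It assumes the conventional definition of the difficulty index as the proportion of examinees who answer an item correctly, and it assumes, purely for illustration, that predicted and actual difficulties are grouped into three bands using the 0.45–0.75 range as cut-offs before Cohen's kappa is computed. All function names and band labels are hypothetical.

```python
from collections import Counter
from math import sqrt


def difficulty_index(responses):
    """Proportion of examinees answering the item correctly (1 = correct, 0 = incorrect)."""
    return sum(responses) / len(responses)


def to_band(p, low=0.45, high=0.75):
    """Map a difficulty index to a band. The cut-offs are an illustrative assumption."""
    if p < low:
        return "hard"
    if p > high:
        return "easy"
    return "moderate"


def cohen_kappa(rater_a, rater_b):
    """Cohen's kappa for two equal-length sequences of categorical labels."""
    n = len(rater_a)
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    count_a, count_b = Counter(rater_a), Counter(rater_b)
    # Chance agreement: sum over categories of the product of marginal proportions.
    expected = sum(count_a[k] * count_b[k] for k in count_a) / (n * n)
    if expected == 1:
        raise ValueError("kappa is undefined when both raters use a single identical category")
    return (observed - expected) / (1 - expected)


def pearson_r(x, y):
    """Pearson correlation coefficient for two equal-length numeric sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)
```

Under these assumptions, categorical agreement would be computed as `cohen_kappa([to_band(p) for p in predicted], [to_band(p) for p in actual])`, and `pearson_r(predicted, actual)` would be used when difficulty is treated as a continuous value.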
Results
Before the training, significant agreement between the predicted and actual item difficulty indices was observed for only one medical subject, Cardiology (K = 0.106, P = 0.021). After the training, however, significant agreement was noted for four subjects: Internal Medicine (K = 0.092, P = 0.015), Cardiology (K = 0.318, P = 0.021), Neurology (K = 0.400, P = 0.043), and Preventive Medicine (r = 0.577, P = 0.039). Furthermore, significant agreement was observed between the predicted and actual difficulty indices across all subjects when analyzing the average difficulty of all items (r = 0.144, P = 0.043). Regarding the actual difficulty index by subject, Neurology exceeded the desired difficulty range of 0.45–0.75 in 2016. By 2018, however, all subjects fell within this range.
Conclusion
Repeated item development training that includes predicting each item’s difficulty index can enhance faculty members’ ability to predict and adjust item difficulty accurately. Such training can therefore help ensure that the difficulty of an examination aligns with its intended purpose. Further studies on faculty development are necessary to explore these benefits more comprehensively.
Publisher
Springer Science and Business Media LLC