The impact of repeated item development training on the prediction of medical faculty members’ item difficulty index

Published:2024-05-30 Issue:1 Volume:24
ISSN:1472-6920
Container-title:BMC Medical Education
language:en
Short-container-title:BMC Med Educ

Author:

Lee Hye Yoon, Yune So Jung, Lee Sang Yeoup, Im Sunju, Kam Bee Sung

Abstract

Background: Item difficulty plays a crucial role in assessing students’ understanding of the concept being tested. The difficulty of each item needs to be carefully adjusted to ensure that the evaluation’s objectives are achieved. Therefore, this study aimed to investigate whether repeated item development training for medical school faculty improves the accuracy of predicting item difficulty in multiple-choice questions.

Methods: A faculty development program was implemented to enhance the prediction of each item’s difficulty index, ensure the absence of item defects, and maintain the general principles of item development. The interrater reliability between the predicted, actual, and corrected item difficulty was assessed before and after the training, using either the kappa index or the correlation coefficient, depending on the characteristics of the data. A total of 62 faculty members participated in the training. Their predictions of item difficulty were compared with the analysis results of 260 items taken by 119 fourth-year medical students in 2016 and 316 items taken by 125 fourth-year medical students in 2018.

Results: Before the training, significant agreement between the predicted and actual item difficulty indices was observed for only one medical subject, Cardiology (K = 0.106, P = 0.021). After the training, significant agreement was noted for four subjects: Internal Medicine (K = 0.092, P = 0.015), Cardiology (K = 0.318, P = 0.021), Neurology (K = 0.400, P = 0.043), and Preventive Medicine (r = 0.577, P = 0.039). Furthermore, significant agreement was observed between the predicted and actual difficulty indices across all subjects when analyzing the average difficulty of all items (r = 0.144, P = 0.043). Regarding the actual difficulty index by subject, Neurology exceeded the desired difficulty range of 0.45–0.75 in 2016. By 2018, however, all subjects fell within this range.

Conclusion: Repeated item development training, which includes predicting each item’s difficulty index, can enhance faculty members’ ability to predict and adjust item difficulty accurately. Item development training can therefore help ensure that the difficulty of an examination aligns with its intended purpose. Further studies on faculty development are necessary to explore these benefits more comprehensively.
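The comparison described in the abstract can be illustrated with a minimal sketch. It assumes the item difficulty index is the proportion of examinees who answer an item correctly (the conventional definition) and that agreement is measured with unweighted Cohen's kappa; the abstract says only "kappa index", so that choice is an assumption. The function names, the three category labels, the use of 0.45 and 0.75 as category cut-offs, and the sample data are all hypothetical and are not taken from the study.

```python
from collections import Counter


def difficulty_index(responses):
    """Proportion of examinees answering an item correctly (1 = correct, 0 = incorrect)."""
    return sum(responses) / len(responses)


def categorize(p, low=0.45, high=0.75):
    """Bin a difficulty index into a label.

    The cut-offs reuse the desired range mentioned in the abstract purely
    for illustration; the study's actual categories are not stated there.
    """
    if p < low:
        return "hard"
    if p > high:
        return "easy"
    return "moderate"


def cohen_kappa(a, b):
    """Unweighted Cohen's kappa for two equal-length sequences of labels."""
    n = len(a)
    # Observed agreement: share of items where both raters give the same label.
    observed = sum(x == y for x, y in zip(a, b)) / n
    # Chance agreement: product of the two raters' marginal label frequencies.
    count_a, count_b = Counter(a), Counter(b)
    expected = sum(count_a[k] * count_b[k] for k in count_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used a single identical label throughout
    return (observed - expected) / (1 - expected)


# Hypothetical data: faculty predictions versus measured difficulty for four items.
predicted = ["easy", "moderate", "moderate", "easy"]
actual_p = [0.80, 0.60, 0.30, 0.90]
actual = [categorize(p) for p in actual_p]

kappa = cohen_kappa(predicted, actual)
print(f"kappa = {kappa:.3f}")
```

A kappa near 0 indicates agreement no better than chance and a value of 1 indicates perfect agreement, which is why the reported values are read alongside their P values rather than on their own.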

Publisher

Springer Science and Business Media LLC

Link

https://link.springer.com/content/pdf/10.1186/s12909-024-05577-x.pdf
