Abstract
Background
The introduction of competency-based education models, student-centered learning, and the increased use of formative assessments have created a demand for high-quality test items for assessments. This study aimed to assess the use of an AI tool to generate type A multiple-choice questions (MCQs) and to evaluate their quality.
Methods
This cross-sectional analytic study was conducted from June 2023 to August 2023 and utilized formative team-based learning (TBL). The AI tool (ChatPdf.com) was selected to generate type A MCQs. The generated items were evaluated using a questionnaire administered to subject experts and an item (psychometric) analysis. The expert questionnaire addressed item quality and ratings of item difficulty.
Results
A total of 25 staff members were recruited as experts, and the questionnaire response rate was 68%. The quality of the items ranged from good to excellent. None of the items had scenarios or vignettes; all were direct. According to the experts' ratings, 80% of the items were easy, and only two (20%) were of moderate difficulty. Only one of the two moderately difficult items had a matching difficulty index. The total number of students participating in TBL was 48. The mean mark was 4.8 ± 1.7 out of 10, and the KR-20 was 0.68. Most items were moderately difficult (90%), and only one was difficult (10%). The discrimination index of the items ranged from 0.15 to 0.77: five items (50%) showed excellent discrimination, three (30%) showed good discrimination, one (10%) was poor, and one was non-discriminating. Twenty-six distractors (86.7%) were functional, and four (13.3%) were non-functional. By distractor analysis, 60% of the items were excellent and 40% were good. A weak positive correlation (r = 0.30, p = 0.4) was found between the difficulty and discrimination indices, which did not reach statistical significance.
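The indices reported above (difficulty index, upper/lower-group discrimination index, and KR-20 reliability) can be computed from a binary student-by-item response matrix. The following is an illustrative sketch, not code from the study; the function name and the 27% upper/lower-group convention are assumptions:

```python
# Illustrative item analysis (not from the study).
# responses: rows = students, columns = items; 1 = correct, 0 = incorrect.

def item_analysis(responses):
    n_students = len(responses)
    n_items = len(responses[0])
    totals = [sum(row) for row in responses]  # total score per student

    # Difficulty index P: proportion of students answering each item correctly.
    difficulty = [sum(row[i] for row in responses) / n_students
                  for i in range(n_items)]

    # Discrimination index D: proportion correct in the top 27% of scorers
    # minus the proportion correct in the bottom 27% (a common convention).
    order = sorted(range(n_students), key=lambda s: totals[s])
    k = max(1, round(0.27 * n_students))
    low, high = order[:k], order[-k:]
    discrimination = [
        sum(responses[s][i] for s in high) / k
        - sum(responses[s][i] for s in low) / k
        for i in range(n_items)
    ]

    # KR-20 reliability: (k/(k-1)) * (1 - sum(p*q) / variance of total scores),
    # using the population variance of total scores.
    mean_total = sum(totals) / n_students
    var_total = sum((t - mean_total) ** 2 for t in totals) / n_students
    pq = sum(p * (1 - p) for p in difficulty)
    kr20 = (n_items / (n_items - 1)) * (1 - pq / var_total)

    return difficulty, discrimination, kr20
```

On a small matrix such as `[[1, 1, 1], [1, 1, 0], [1, 0, 0], [0, 0, 0]]`, the function returns per-item difficulties of 0.75, 0.5, and 0.25, discrimination of 1.0 for every item, and a KR-20 of 0.75.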
Conclusion
Items constructed using AI had good psychometric properties and quality and measured higher-order cognitive domains. AI allows the construction of many items within a short time. We hope this paper brings the use of AI in item generation, and the associated challenges, into a multi-layered discussion that will eventually lead to improvements in item generation and assessment in general.
Publisher
Springer Science and Business Media LLC
Cited by: 2 articles.