Affiliation:
1. National Defense University
2. Mugla Sitki Kocman University
3. Gazi University
Abstract
This study compares the Wald test and likelihood ratio test (LRT) approaches developed for cognitive diagnostic models (CDMs) with differential item functioning (DIF) detection methods based on Classical Test Theory (CTT) and Item Response Theory (IRT), using the TIMSS 2011 dataset in a retrofitting design. CDMs hold considerable potential for detecting DIF and for contributing to validity evidence, but they yield trustworthy results only when their strong methodological requirements are met. This study is therefore intended to contribute to the literature on the correct use of CDMs and to evaluate how well these newer approaches agree with traditional methods. According to the analysis results, thirty-one items were flagged differently by the cognitive diagnosis approaches and the traditional methods. The largest DIF was detected by Raju's unsigned area measure, an IRT-based technique, whereas the smallest DIF was detected by the Wald test developed for CDMs. Overall, the methods not based on CDMs flagged more items with DIF, while the CDM-based Wald test and LRT flagged fewer. The DIF analyses in this study address the test's psychometric properties within the CDM framework rather than the sources of bias; researchers can take the work a step further with more specific assessments of item bias in relation to test structure, test content, and subgroups. In addition, the DIF analyses were carried out using only the gender variable, so researchers can apply other grouping variables suited to their own purposes.
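As a point of reference for the traditional methods the abstract mentions, the sketch below implements the Mantel-Haenszel chi-square, a standard CTT-based DIF procedure: examinees are stratified by a matching score and a 2x2 group-by-response table is pooled across strata. This is a minimal illustrative sketch, not the authors' exact setup; the function name, the skipping rule for tiny strata, and the choice of matching variable are assumptions made here for clarity.

```python
import numpy as np
from scipy.stats import chi2

def mantel_haenszel_dif(item, total, group):
    """Mantel-Haenszel DIF chi-square for one dichotomous item.

    item  : 0/1 responses to the studied item
    total : matching score (e.g., rest score or total test score)
    group : 0 = reference group, 1 = focal group
    Returns the continuity-corrected MH chi-square and its p-value.
    """
    item, total, group = map(np.asarray, (item, total, group))
    num = 0.0  # sum over strata of (a_k - E[a_k])
    var = 0.0  # sum over strata of Var(a_k)
    for k in np.unique(total):
        s = total == k
        a = np.sum((group[s] == 0) & (item[s] == 1))  # reference, correct
        b = np.sum((group[s] == 0) & (item[s] == 0))  # reference, incorrect
        c = np.sum((group[s] == 1) & (item[s] == 1))  # focal, correct
        d = np.sum((group[s] == 1) & (item[s] == 0))  # focal, incorrect
        n = a + b + c + d
        if n < 2:
            continue  # stratum too small to contribute (assumed skip rule)
        num += a - (a + b) * (a + c) / n                       # observed - expected
        var += (a + b) * (c + d) * (a + c) * (b + d) / (n**2 * (n - 1))
    if var == 0:
        return np.nan, np.nan
    stat = (abs(num) - 0.5) ** 2 / var  # continuity-corrected chi-square, df = 1
    return stat, chi2.sf(stat, df=1)
```

For example, `mantel_haenszel_dif(resp[:, j], resp.sum(axis=1) - resp[:, j], gender)` would test item `j` against the rest score for a male/female grouping, flagging the item when the p-value falls below a chosen alpha. The CDM-based Wald and LRT procedures compared in the study operate differently, testing equality of item parameters across groups under a fitted cognitive diagnosis model.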
Publisher
Egitimde ve Psikolojide Olcme ve Degerlendirme Dergisi
Subject
Developmental and Educational Psychology, Education
Cited by
1 article.