
Power Analysis for the Wald, LR, Score, and Gradient Tests in a Marginal Maximum Likelihood Framework: Applications in IRT

Published: 2022-08-27
ISSN: 0033-3123
Container-title: Psychometrika
Language: en
Short-container-title: Psychometrika

Author:

Zimmer Felix, Draxler Clemens, Debelak Rudolf

Abstract

The Wald, likelihood ratio, score, and the recently proposed gradient statistics can be used to assess a broad range of hypotheses in item response theory models, for instance, to check the overall model fit or to detect differential item functioning. We introduce new methods for power analysis and sample size planning that can be applied when marginal maximum likelihood estimation is used. This allows the application to a variety of IRT models, which are commonly used in practice, e.g., in large-scale educational assessments. An analytical method utilizes the asymptotic distributions of the statistics under alternative hypotheses. We also provide a sampling-based approach for applications where the analytical approach is computationally infeasible. This can be the case with 20 or more items, since the computational load increases exponentially with the number of items. We performed extensive simulation studies in three practically relevant settings, i.e., testing a Rasch model against a 2PL model, testing for differential item functioning, and testing a partial credit model against a generalized partial credit model. The observed distributions of the test statistics and the power of the tests agreed well with the predictions by the proposed methods in sufficiently large samples. We provide an openly accessible R package that implements the methods for user-supplied hypotheses.
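
The analytical idea summarized in the abstract can be illustrated with a minimal sketch. It assumes the standard asymptotic result that, under an alternative hypothesis, the test statistic approximately follows a noncentral chi-square distribution whose noncentrality parameter grows linearly with the sample size. The sketch is written in Python with the standard library only; it is not the authors' R package. The function names and the per-person noncentrality value `ncp_per_person` are hypothetical inputs chosen for illustration. Deriving that value from a concrete IRT model is the computationally expensive step and is not shown here.

```python
import math


def reg_lower_gamma(a: float, x: float) -> float:
    """Regularized lower incomplete gamma P(a, x), computed by its power series."""
    if x <= 0.0:
        return 0.0
    term = 1.0 / a
    total = term
    n = 0
    while abs(term) > 1e-16 * abs(total):
        n += 1
        term *= x / (a + n)
        total += term
    return total * math.exp(a * math.log(x) - x - math.lgamma(a))


def chi2_cdf(x: float, df: float) -> float:
    """CDF of the central chi-square distribution."""
    return reg_lower_gamma(df / 2.0, x / 2.0)


def chi2_quantile(p: float, df: float) -> float:
    """Quantile of the central chi-square distribution, found by bisection."""
    lo, hi = 0.0, 1.0
    while chi2_cdf(hi, df) < p:
        hi *= 2.0
    for _ in range(100):
        mid = (lo + hi) / 2.0
        if chi2_cdf(mid, df) < p:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


def ncx2_cdf(x: float, df: float, ncp: float) -> float:
    """CDF of the noncentral chi-square as a Poisson mixture of central ones.

    Intended for moderate ncp only (exp(-ncp / 2) must not underflow).
    """
    half = ncp / 2.0
    weight = math.exp(-half)  # Poisson probability of j = 0
    cum_weight = 0.0
    total = 0.0
    j = 0
    while cum_weight < 1.0 - 1e-12 and j < 10_000:
        total += weight * chi2_cdf(x, df + 2 * j)
        cum_weight += weight
        j += 1
        weight *= half / j  # Poisson probability of the next j
    return total


def power(ncp: float, df: float, alpha: float = 0.05) -> float:
    """Approximate power of a chi-square test with noncentrality ncp."""
    crit = chi2_quantile(1.0 - alpha, df)
    return 1.0 - ncx2_cdf(crit, df, ncp)


def required_n(ncp_per_person: float, df: float,
               target: float = 0.8, alpha: float = 0.05) -> int:
    """Smallest n reaching the target power, assuming ncp = n * ncp_per_person."""
    if ncp_per_person <= 0.0:
        raise ValueError("ncp_per_person must be positive")
    crit = chi2_quantile(1.0 - alpha, df)
    n = 1
    while 1.0 - ncx2_cdf(crit, df, n * ncp_per_person) < target:
        n += 1
    return n


if __name__ == "__main__":
    # Hypothetical value: noncentrality contributed by one respondent.
    print(required_n(ncp_per_person=0.01, df=1))
```

With a noncentrality of zero the function `power` returns the significance level, which is a quick sanity check of the sketch. The sampling-based approach mentioned in the abstract would replace the exact computation of the noncentrality parameter with an estimate from simulated data.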

Funder

Schweizerischer Nationalfonds zur Förderung der Wissenschaftlichen Forschung

Publisher

Springer Science and Business Media LLC

Subject

Applied Mathematics,General Psychology

Link

https://link.springer.com/content/pdf/10.1007/s11336-022-09883-5.pdf

References: 89 articles.

1. Agresti, A. (2002). Categorical data analysis (2nd ed.). Wiley.

2. American Educational Research Association, American Psychological Association, & National Council on Measurement in Education. (2014). Standards for educational and psychological testing. American Educational Research Association.

3. Andersen, E. B. (1973). Conditional inference and models for measuring (Vol. 5). Mentalhygiejnisk Forlag.

4. Baker, F. B., & Kim, S.-H. (2004). Item response theory. CRC Press.

5. Birnbaum, A. (1968). Some latent trait models and their use in inferring an examinee’s ability. In F. M. Lord & M. R. Novick (Eds.), Statistical theories of mental test scores. Addison-Wesley.

Cited by 15 articles.

1. A Review of Some of the History of Factorial Invariance and Differential Item Functioning;Multivariate Behavioral Research;2024-09-12

2. Factors associated with intrinsic capacity impairment in hospitalized older adults: a latent class analysis;BMC Geriatrics;2024-06-05

3. Health literacy in patients with gout: A latent profile analysis;PLOS ONE;2024-05-09

4. Anxiety Levels in Caregivers of Transitional ICU Patients: A Cross-sectional Survey;2024-04-08

5. Optimizing Maximum Likelihood Estimation in Performance Factor Analysis: A Comparative Study of Estimation Methods;Springer Proceedings in Mathematics & Statistics;2024