1. American Educational Research Association, American Psychological Association, & National Council on Measurement in Education (1999). Standards for educational and psychological testing. Washington, DC: American Educational Research Association.
2. Bock, R.D., & Aitkin, M. (1981). Marginal maximum likelihood estimation of item parameters: application of an EM algorithm. Psychometrika, 46, 443–459.
3. Bock, R.D., & Haberman, S.J. (2009). Confidence bands for examining goodness-of-fit of estimated item response functions. Paper presented at the annual meeting of the Psychometric Society, Cambridge, UK.
4. Box, G.E.P., & Draper, N.R. (1987). Empirical model-building and response surfaces. New York: Wiley.
5. Chon, K.H., Lee, W., & Dunbar, S.B. (2010). A comparison of item fit statistics for mixed IRT models. Journal of Educational Measurement, 47, 318–338.