1. Abedi, J., Baker, E.L., & Herl, H. (1995). Comparing reliability indices obtained by different approaches for performance assessments (CSE Report 401). Los Angeles: University of California, National Center for Research on Evaluation, Standards, and Student Testing (CRESST). https://cresst.org/wp-content/uploads/TECH401.pdf
2. Agresti, A. (2013). Categorical data analysis (3rd ed.). John Wiley & Sons.
3. Airasian, P.W. (1994). Classroom assessment (2nd ed.). McGraw-Hill.
4. Aktaş, M., & Alıcı, D. (2017). Kontrol listesi, analitik rubrik ve dereceleme ölçeklerinde puanlayıcı güvenirliğinin genellenebilirlik kuramına göre incelenmesi [Examination of scoring reliability according to generalizability theory in checklist, analytic rubric, and rating scales]. International Journal of Eurasia Social Sciences, 8(29), 991-1010.
5. Anadol, H.Ö., & Doğan, C.D. (2018). Dereceli puanlama anahtarlarının güvenirliğinin farklı deneyim yıllarına sahip puanlayıcıların kullanıldığı durumlarda incelenmesi [The examination of reliability of scoring rubrics regarding raters with different experience years]. İlköğretim Online, 1066-1076. https://doi.org/10.17051/ilkonline.2018.419355