Affiliations:
1. School of Psychology, Shanghai Normal University
2. Lab for Educational Big Data and Policymaking
3. Department of Educational Psychology, The Chinese University of Hong Kong
Abstract
Short scales are time-efficient for participants and cost-effective in research. However, researchers often mistakenly expect short scales to have the same reliability as long ones without considering the effect of scale length. We argue that applying a universal benchmark for alpha is problematic because low-quality items have a greater impact on shorter scales. In this study, we propose simple guidelines for item reduction using the "alpha-if-item-deleted" procedure in scale construction. An item can be removed if alpha increases, or decreases by less than .02, upon its deletion, especially for short scales. Conversely, an item should be retained if alpha decreases by more than .04 upon its removal. For reliability benchmarks, .80 is relatively safe in most conditions, but higher benchmarks are recommended for longer scales and smaller sample sizes. Supplementary analyses, including item content, face validity, and content coverage, are critical to ensuring scale quality.
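To make the decision rule concrete, the sketch below illustrates one possible implementation of the "alpha-if-item-deleted" screening described above. It is not the authors' code: the function names (`cronbach_alpha`, `screen_items`), the respondents-by-items matrix layout, and the handling of the region between the .02 and .04 margins are illustrative assumptions.

```python
# Minimal sketch of the alpha-if-item-deleted rule of thumb from the abstract.
# Assumes `data` is a NumPy array of shape (n_respondents, n_items); all names
# here are hypothetical, not taken from the paper's materials.
import numpy as np

def cronbach_alpha(data: np.ndarray) -> float:
    """Cronbach's alpha for a respondents-by-items score matrix."""
    k = data.shape[1]
    item_vars = data.var(axis=0, ddof=1)        # per-item variances
    total_var = data.sum(axis=1).var(ddof=1)    # variance of the sum score
    return (k / (k - 1)) * (1 - item_vars.sum() / total_var)

def screen_items(data: np.ndarray, drop_margin: float = 0.02,
                 keep_margin: float = 0.04) -> tuple[float, dict]:
    """Flag each item using the .02 / .04 margins proposed in the abstract.

    An item is flagged 'removable' if deleting it raises alpha or lowers it
    by less than `drop_margin`, and 'retain' if alpha drops by more than
    `keep_margin`; changes in between are left as a judgment call.
    """
    alpha_full = cronbach_alpha(data)
    decisions = {}
    for j in range(data.shape[1]):
        alpha_deleted = cronbach_alpha(np.delete(data, j, axis=1))
        change = alpha_deleted - alpha_full     # positive => alpha rises when item j is dropped
        if change > -drop_margin:
            decisions[j] = ("removable", round(change, 3))
        elif change < -keep_margin:
            decisions[j] = ("retain", round(change, 3))
        else:
            decisions[j] = ("judgment call", round(change, 3))
    return alpha_full, decisions
```

As the abstract stresses, such numeric screening is only a starting point: decisions about any flagged item should also weigh item content, face validity, and content coverage.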