Affiliation
1. Department of Educational Psychology, University of Illinois Chicago, Chicago, IL 60607, USA
2. Department of Behavioral Health and Nutrition, University of Delaware, Newark, DE 19716, USA
Abstract
Dichotomous data (e.g., positive/negative, case/control, missing/observed) are common in many fields, including medicine, health, and the social sciences. Despite their ubiquity, criteria for determining dimensionality from dichotomous variables are not yet established. In Study 1, we conducted a large-scale simulation to evaluate four criteria (Kaiser, empirical Kaiser, parallel analysis, and profile likelihood) for determining the dimensionality of dichotomous data. The simulation crossed correlation matrices (Pearson r or tetrachoric ρ) and analysis methods (principal component analysis or exploratory factor analysis) with study characteristics: sample size (100, 250, and 1000), variable split (10%/90%, 25%/75%, and 50%/50%), number of dimensions (1, 3, 5, and 10), and items per dimension (3, 5, and 10), with 1000 replications per condition. Parallel analysis performed best, recovering dimensionality in 87.9% of replications when using principal component analysis with Pearson correlations. Guidance for selecting criteria is provided. In Study 2, we applied this dimensionality reduction approach to two longitudinal data sets in which missing data posed difficulties for multivariate analysis. These applications suggest that exploring the resulting missing data meta-patterns is useful in practice.
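As a rough illustration of the best-performing combination named in the abstract (parallel analysis on principal components of a Pearson correlation matrix), the following is a minimal sketch and not the authors' implementation. The function names `simulate_dichotomous` and `parallel_analysis` are hypothetical, and the loading of 0.8, the column-permutation null, the 95th-percentile threshold, and the sequential stopping rule are all assumptions chosen for illustration; the abstract does not specify them.

```python
import numpy as np

def simulate_dichotomous(n=1000, dims=3, items_per_dim=5, loading=0.8, seed=1):
    """Simulate 0/1 items with a simple block structure (illustrative only)."""
    rng = np.random.default_rng(seed)
    factors = rng.standard_normal((n, dims))
    # Each factor drives `items_per_dim` consecutive items.
    latent = np.repeat(factors, items_per_dim, axis=1) * loading
    latent += np.sqrt(1 - loading**2) * rng.standard_normal(latent.shape)
    # Threshold at 0 to get an approximately 50%/50% split per item.
    return (latent > 0).astype(int)

def parallel_analysis(X, n_sims=200, percentile=95, seed=0):
    """Number of components whose eigenvalue exceeds the null threshold."""
    rng = np.random.default_rng(seed)
    n, p = X.shape
    # Eigenvalues of the observed Pearson correlation matrix, largest first.
    obs = np.linalg.eigvalsh(np.corrcoef(X, rowvar=False))[::-1]
    sims = np.empty((n_sims, p))
    for s in range(n_sims):
        # Assumed null model: permute each column independently, which keeps
        # every item's split but removes associations between items.
        Xp = np.column_stack([rng.permutation(X[:, j]) for j in range(p)])
        sims[s] = np.linalg.eigvalsh(np.corrcoef(Xp, rowvar=False))[::-1]
    thresh = np.percentile(sims, percentile, axis=0)
    exceeds = obs > thresh
    # Retain leading components up to the first one that fails the threshold.
    return p if exceeds.all() else int(np.argmin(exceeds))

X = simulate_dichotomous()
print(parallel_analysis(X))
```

The sketch assumes no item is constant; a constant column would make the Pearson correlation undefined. Swapping in tetrachoric correlations or exploratory factor analysis, the other combinations the study evaluates, would require additional routines not shown here.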
Funder
National Cancer Institute of the National Institutes of Health
Subject
General Mathematics, Engineering (miscellaneous), Computer Science (miscellaneous)