Abstract
AbstractThe analysis of questionnaires often involves representing the high-dimensional responses in a low-dimensional space (e.g., PCA, MCA, or t-SNE). However questionnaire data often contains categorical variables and common statistical model assumptions rarely hold. Here we present a non-parametric approach based on Fisher Information which obtains a low-dimensional embedding of a statistical manifold (SM). The SM has deep connections with parametric statistical models and the theory of phase transitions in statistical physics. Firstly we simulate questionnaire responses based on a non-linear SM and validate our method compared to other methods. Secondly we apply our method to two empirical datasets containing largely categorical variables: an anthropological survey of rice farmers in Bali and a cohort study on health inequality in Amsterdam. Compare to previous analysis and known anthropological knowledge we conclude that our method best discriminates between different behaviours, paving the way to dimension reduction as effective as for continuous data.
Funder
EC | EC Seventh Framework Programm | FP7 Information and Communication Technologies
Publisher
Springer Science and Business Media LLC
Cited by
4 articles.
订阅此论文施引文献
订阅此论文施引文献,注册后可以免费订阅5篇论文的施引文献,订阅后可以查看论文全部施引文献