Categorization of continuous covariates and complex regression models – friends or foes in intersectionality research

Author:

Richter Adrian (1), Ulbricht Sabina (1), Brockhaus Sarah (2)

Affiliation:

1. Department of Prevention Research and Social Medicine, Institute for Community Medicine, University Medicine Greifswald, Greifswald, Germany

2. Faculty of Computer Science and Mathematics, University of Applied Sciences Munich, Munich, Germany

Abstract

Objective: Studies of intersectionality are increasingly used to examine health inequalities, and different proposals for examining intersections have recently been published. One approach (1) specifies models with first-order and all second-order effects; another (2) stratifies on multiple covariates. Both categorize continuous covariates. A simulation study was conducted to review both methods with regard to correct identification of intersections, rate of false positive results, and generalizability to independent data, compared to an established approach (3) of backward variable elimination according to the Bayesian information criterion (BE-BIC).

Study design and setting: Two fundamentally different settings were simulated with 1000 replications each. Setting 1 comprised the covariates age, sex, body mass index, education, and diabetes, with no association between the covariates and a continuous response. Setting 2 comprised the same covariates plus a non-linear interaction term of age and sex; that is, a non-linear increase in females above middle age formed the intersection of interest. The sample size (N = 200 to N = 3000) and the signal-to-noise ratio (SNR, 0.5 to 4) were varied. In each simulated dataset, bootstrap resampling with replacement was used to fit the models to internal learning data; the fitted models then predicted outcomes in the learning data and in the internal validation data, and the mean squared error (MSE) was calculated in both.

Results: In simulation setting 1, approaches 1 and 2 generated spurious effects in more than 90% of simulations across all sample sizes. Approach 3 (BE-BIC) selected the correct model in 36.5% of simulations at smaller sample sizes and in 89.8% at larger sample sizes, and it always had a lower number of spurious effects. MSE in independent data was generally higher for approaches 1 and 2 than for approach 3. In simulation setting 2, approach 1 selected the correct interaction most frequently but often showed spurious effects (> 75%). Across all sample sizes and SNRs, approach 3 generated spurious results least often and had the lowest MSE in independent data.

Conclusion: Categorization of continuous covariates is detrimental to studies on intersectionality. Because of their high model complexity, such approaches are prone to spurious effects and often lack interpretability. Approach 3 (BE-BIC) is considerably more robust against spurious findings, showed better generalizability to independent data, and can be used with most statistical software. For intersectionality research, we consider it more important to describe relevant intersections than all possible intersections.
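To make the idea behind approach 3 concrete, the following is a minimal sketch of backward elimination by BIC for a Gaussian linear model, written in plain Python. It is not the authors' implementation: the function names (`solve`, `bic`, `backward_bic`), the toy data (four standard-normal covariates, of which only the first affects the response), and the seed are all illustrative assumptions, and the sketch omits the interaction and non-linear terms studied in the paper.

```python
import math
import random


def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(M[r][k] * x[k] for k in range(r + 1, n))
        x[r] = (M[r][n] - tail) / M[r][r]
    return x


def bic(X, y, cols):
    """BIC (up to an additive constant) of an OLS fit with intercept + cols."""
    n = len(y)
    Z = [[1.0] + [row[c] for c in cols] for row in X]
    p = len(cols) + 1
    XtX = [[sum(z[i] * z[j] for z in Z) for j in range(p)] for i in range(p)]
    Xty = [sum(z[i] * yi for z, yi in zip(Z, y)) for i in range(p)]
    beta = solve(XtX, Xty)
    rss = sum((yi - sum(b * v for b, v in zip(beta, z))) ** 2
              for z, yi in zip(Z, y))
    return n * math.log(rss / n) + p * math.log(n)


def backward_bic(X, y):
    """Drop one covariate at a time for as long as doing so lowers the BIC."""
    cols = list(range(len(X[0])))
    best = bic(X, y, cols)
    while cols:
        candidates = [(bic(X, y, [c for c in cols if c != d]), d) for d in cols]
        score, drop = min(candidates)
        if score >= best:
            break  # no single removal improves the BIC
        best = score
        cols.remove(drop)
    return cols


# Toy data (hypothetical): only covariate 0 is associated with the response.
rng = random.Random(1)
X = [[rng.gauss(0, 1) for _ in range(4)] for _ in range(500)]
y = [2.0 * row[0] + rng.gauss(0, 1) for row in X]

selected = backward_bic(X, y)
print("retained covariate indices:", selected)
```

Because the BIC penalty per parameter grows with log(N), pure-noise covariates are rarely retained at larger sample sizes, which is in line with the lower rate of spurious effects reported for approach 3.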

Publisher

Research Square Platform LLC
