Authors:
Wendling Alexandre, Galiez Clovis
Abstract
The analysis of categorical data, particularly the study of associations between binary outcomes and binary features, is crucial across various scientific disciplines, such as assessing the impact of vaccination on health outcomes. Traditional 2×2 contingency tables are commonly used to summarize binary counts; however, these analyses can be confounded by external factors like age or gender, necessitating stratification to create sub-tables. Stratified analysis is prevalent in medical, epidemiological, and social research, as well as in meta-analyses. Current methodologies for testing associations across strata struggle with small sample sizes and heterogeneity of the effect among strata. To cope with these limitations, exact tests can be used, but at a very high computational cost, preventing their use in most situations. Here, we propose the Gamma Approximation of Stratified Truncated Exact (GASTE) test as a robust alternative. The core of this paper presents a method for approximating the exact statistic for combining p-values with discrete support, leveraging the gamma distribution to approximate the distribution of the test statistic under stratification. We show that this approximation maintains high test power while keeping a low type I error rate. The GASTE method provides fast and accurate p-value calculations even in the presence of homogeneous and heterogeneous effects between strata, and is robust in scenarios with varying levels of significance. Our findings demonstrate that the GASTE test outperforms traditional methods, offering more sensitive and reliable detections. This advancement not only enhances the robustness of stratified analyses but also, thanks to its fast computation, broadens the applicability of exact tests in various research fields. First, we illustrate our method through the ecological application that motivated its development, consisting of the study of Alpine plant associations.
Second, we apply our method to a well-known case study of stratified binary data, concerning admissions to the University of California at Berkeley in 1973. Overall, the GASTE method is a powerful and flexible tool for researchers dealing with stratified binary data, offering substantial improvements over traditional methods such as the CMH (Cochran-Mantel-Haenszel) test. An open-source Python package is provided at https://github.com/AlexandreWen/gaste.
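
To make the setting concrete, the following is a minimal sketch in plain Python. It is not the GASTE implementation and does not use the API of the `gaste` package. It assumes one-sided Fisher exact p-values per stratum, combined with Fisher's statistic (minus twice the sum of log p-values); the function names (`stratum_null`, `exact_combined_pvalue`, `gamma_moment_match`) and the toy counts are hypothetical. The sketch shows why the exact test is expensive (brute-force enumeration over the product of the per-stratum supports) and how a gamma distribution could be fitted to the null statistic by matching its first two moments, which is only one plausible fitting choice. Truncation is not implemented.

```python
from itertools import product
from math import comb, log


def stratum_null(table):
    """For one 2x2 table [[a, b], [c, d]] with fixed margins, return the
    observed upper-tail Fisher exact p-value and the null distribution of
    that p-value as a list of (p_value, probability) pairs."""
    (a, b), (c, d) = table
    n1, n2, m = a + b, c + d, a + c          # row sizes, first-column total
    lo, hi = max(0, m - n2), min(n1, m)      # support of the top-left cell
    total = comb(n1 + n2, m)
    pmf = {k: comb(n1, k) * comb(n2, m - k) / total for k in range(lo, hi + 1)}
    pval, tail = {}, 0.0
    for k in range(hi, lo - 1, -1):          # accumulate P(X >= k)
        tail += pmf[k]
        pval[k] = min(tail, 1.0)
    return pval[a], [(pval[k], pmf[k]) for k in range(lo, hi + 1)]


def exact_combined_pvalue(tables):
    """Exact p-value of Fisher's combined statistic by full enumeration.
    The cost grows as the product of the support sizes of all strata."""
    observed, supports = zip(*(stratum_null(t) for t in tables))
    stat_obs = -2.0 * sum(log(p) for p in observed)
    p_exact = 0.0
    for combo in product(*supports):
        stat = -2.0 * sum(log(p) for p, _ in combo)
        if stat >= stat_obs - 1e-12:         # tolerance for float ties
            prob = 1.0
            for _, w in combo:
                prob *= w
            p_exact += prob
    return stat_obs, p_exact


def gamma_moment_match(tables):
    """Gamma (shape, scale) matching the exact null mean and variance of the
    combined statistic. Assumes the variance is non-zero."""
    mean = var = 0.0
    for t in tables:
        _, support = stratum_null(t)
        m1 = sum(w * (-2.0 * log(p)) for p, w in support)
        m2 = sum(w * (-2.0 * log(p)) ** 2 for p, w in support)
        mean += m1
        var += m2 - m1 ** 2
    return mean ** 2 / var, var / mean


# Toy strata (illustrative counts only, not data from the paper).
strata = [[[3, 1], [1, 3]], [[5, 2], [2, 4]], [[2, 2], [1, 4]]]
stat, p_exact = exact_combined_pvalue(strata)
shape, scale = gamma_moment_match(strata)
print(stat, p_exact, shape, scale)
```

With the fitted parameters, a tail probability could then be obtained from any gamma survival function, for instance `scipy.stats.gamma.sf(stat, a=shape, scale=scale)`, and compared with the enumerated value. The enumeration is tractable only for a few small strata, which is the cost that motivates an approximation.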
Publisher
Cold Spring Harbor Laboratory