Affiliation:
1. Oncology Statistical Innovation, AstraZeneca, Gaithersburg, Maryland, USA
Abstract
Preclinical studies span a broad range of work, including cellular research, animal trials, and small human trials. They tend to be exploratory and to rely on small datasets that often consist of biomarker data. Logistic regression is typically the model of choice for a binary outcome with explanatory variables such as genetic, imaging, and clinical data. In small preclinical studies, however, the data may exhibit complete or quasi‐complete separation, in which a predictor (or a combination of predictors) perfectly or almost perfectly distinguishes the two outcome groups. Under separation, at least one maximum likelihood estimate from standard logistic regression is not finite, and fitting software reports inflated coefficient estimates and standard errors. Penalized regression approaches such as Firth's logistic regression reduce the bias in the estimates and yield finite values. In this tutorial, a number of examples with complete or quasi‐complete separation are presented, and the results from standard logistic regression and Firth's logistic regression are compared to demonstrate the inflated estimates of the former and the bias reduction achieved by the latter. R code and datasets are provided in the supplement.
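The contrast described above can be illustrated with a minimal sketch. It is written in plain Python rather than the R of the supplement, and it is not the authors' code: the six-point dataset is a made-up toy example with complete separation, and the helper names (`log_lik`, `firth_fit`) are hypothetical. The sketch assumes one covariate plus an intercept and uses the standard Firth-modified score, in which each residual is adjusted by h_i(0.5 − p_i), where h_i is the i-th diagonal of the hat matrix.

```python
import math

def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def softplus(z):
    # Stable log(1 + exp(z)).
    return max(z, 0.0) + math.log1p(math.exp(-abs(z)))

def log_lik(b0, b1, x, y):
    # Ordinary (unpenalized) logistic log-likelihood.
    return sum(yi * (b0 + b1 * xi) - softplus(b0 + b1 * xi)
               for xi, yi in zip(x, y))

def firth_fit(x, y, max_iter=100, tol=1e-8):
    # Fisher scoring on the Firth-modified score for intercept + one slope.
    b0, b1 = 0.0, 0.0
    for _ in range(max_iter):
        p = [sigmoid(b0 + b1 * xi) for xi in x]
        w = [pi * (1.0 - pi) for pi in p]
        # Fisher information I = X^T W X, a symmetric 2x2 matrix.
        a = sum(w)
        b = sum(wi * xi for wi, xi in zip(w, x))
        d = sum(wi * xi * xi for wi, xi in zip(w, x))
        det = a * d - b * b
        i00, i01, i11 = d / det, -b / det, a / det  # entries of I^{-1}
        # Hat-matrix diagonals: h_i = w_i * [1, x_i] I^{-1} [1, x_i]^T.
        h = [wi * (i00 + 2.0 * i01 * xi + i11 * xi * xi)
             for wi, xi in zip(w, x)]
        # Firth-modified residuals and score vector.
        r = [yi - pi + hi * (0.5 - pi) for yi, pi, hi in zip(y, p, h)]
        u0 = sum(r)
        u1 = sum(ri * xi for ri, xi in zip(r, x))
        # Scoring step: beta <- beta + I^{-1} U*.
        s0 = i00 * u0 + i01 * u1
        s1 = i01 * u0 + i11 * u1
        b0, b1 = b0 + s0, b1 + s1
        if max(abs(s0), abs(s1)) < tol:
            return b0, b1, True
    return b0, b1, False

# Toy data (illustrative only): every x <= 3 has y = 0, every x >= 4 has y = 1.
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y = [0, 0, 0, 1, 1, 1]

# Scaling the coefficients along a separating direction keeps raising the
# ordinary log-likelihood toward 0, so no finite maximizer exists.
lls = [log_lik(-3.5 * c, c, x, y) for c in (1.0, 10.0, 100.0)]
print("log-likelihood at slope 1, 10, 100:", lls)

# The Firth-penalized fit converges to finite estimates on the same data.
b0, b1, converged = firth_fit(x, y)
print("Firth intercept and slope:", b0, b1, "converged:", converged)
```

The first printout shows the ordinary log-likelihood still increasing as the slope grows, which is why standard software reports very large estimates under separation; the second shows that the penalized fit settles on a finite slope. For real analyses, an established implementation such as the one used in the supplement is preferable to this hand-rolled two-parameter version.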