Abstract
We describe a multi-factor model of the spread of COVID-19 across the 58 counties of California from March 2020 to June 2023. The model estimates cumulative cases and the duration of the epidemic as functions of five independent variables: the population, population density, family income, Gini coefficient, and land area of each county. The correlation coefficients of these factors are used to reduce the error in our model.
The model produces two linear equations, one for cumulative cases and the other for duration of infection. The cumulative case estimate is highly correlated with population, but it improves when all five factors are considered. The duration of infection estimate improves when population and income level are considered. We also find that the infection rate varies widely and is roughly normally distributed, suggesting randomness rather than correlation with one or more of the five factors.
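As an illustration of what a linear equation in five factors looks like in practice, the following minimal sketch fits one by ordinary least squares. The fitting method, the function names, and the factor ordering are assumptions made for this example, since the abstract does not specify the estimation procedure. The data are synthetic placeholders, not county measurements.

```python
import numpy as np

# Hypothetical factor ordering, assumed for this illustration only.
FACTORS = ["population", "density", "income", "gini", "land_area"]


def fit_linear_model(X, y):
    """Ordinary least squares fit of y ~ b0 + X @ b. Returns (b0, b)."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    # Prepend a column of ones so the intercept is estimated with the slopes.
    A = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef[0], coef[1:]


def factor_correlations(X, y):
    """Pearson correlation of each factor column with the outcome y."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    return np.array([np.corrcoef(X[:, j], y)[0, 1] for j in range(X.shape[1])])


# Synthetic, noise-free placeholder data: 58 rows (one per county), 5 factors.
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(58, len(FACTORS)))
true_b = np.array([2.0, 0.5, -0.3, 0.1, 0.0])  # made-up coefficients
y_demo = 1.0 + X_demo @ true_b

b0_demo, b_demo = fit_linear_model(X_demo, y_demo)
r_demo = factor_correlations(X_demo, y_demo)
```

The same routine could be applied twice, once with cumulative cases and once with duration of infection as the outcome, to obtain two equations of the kind described above. Comparing the per-factor correlations in `r_demo` is one way to judge which factors are worth including.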