Affiliation:
1. University of California, Berkeley, Berkeley, CA, USA
Abstract
In regression analysis with a continuous and positive dependent variable, a multiplicative relationship between the unlogged dependent variable and the independent variables is often specified. It can then be estimated in its unlogged or logged form. The two procedures may yield major differences in estimates, even opposite signs, because they target different quantities: estimation on the unlogged form yields coefficients for the relative arithmetic mean of the unlogged dependent variable, whereas estimation on the logged form gives coefficients for the relative geometric mean of the unlogged dependent variable (or for absolute differences in the arithmetic mean of the logged dependent variable). The first goal of this article is to explain why major divergencies in coefficients can occur. Although well understood in the statistical literature, this is not widely understood in sociological research, and it is hence of significant practical interest. The second goal is to derive conditions under which divergencies will not occur, where estimation on the logged form will give unbiased estimators for relative arithmetic means. First, the article derives the necessary and sufficient conditions under which estimation on the logged form will give unbiased estimators for the parameters for the relative arithmetic mean. This requires not only arithmetic mean independence of the unlogged error term but also geometric mean independence. Second, it shows that statistical independence of the error terms from the regressors implies both arithmetic and geometric mean independence of the error terms, and is hence a sufficient condition for absence of bias. Third, it shows that although statistical independence is a sufficient condition, it is not a necessary one for lack of bias. Fourth, it demonstrates that homoskedasticity of error terms is neither a necessary nor a sufficient condition for absence of bias. Fifth, it shows that in the semi-logarithmic specification, for a logged error term with the same qualitative distributional shape at each value of the independent variables (e.g., normal), with arithmetic mean independence but heteroskedasticity, estimation on the logged form will give biased estimators for the parameters for the arithmetic mean (whereas with homoskedasticity, and in this case thus statistical independence, estimators are unbiased, by the second result above).
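The divergence described in the abstract can be illustrated with a small simulation. The sketch below is not taken from the article: the data-generating process, the parameter values (b0, b1, s0, s1), and the function names are all assumptions chosen for illustration. It uses a semi-logarithmic model y = exp(b0 + b1*x) * eps with a binary regressor x and a normal logged error whose variance depends on x, centered so that E[eps | x] = 1 (arithmetic mean independence with heteroskedasticity). With a binary regressor, the OLS slope from regressing ln y on x equals the difference in group means of ln y (the log relative geometric mean), while a saturated fit on the unlogged form reproduces the group arithmetic means. Under these assumptions the log relative arithmetic mean should be close to b1 = 0.2, whereas the logged-form slope should be close to b1 - s1/2 = -0.3, an opposite sign.

```python
import numpy as np


def simulate(n=200_000, b0=1.0, b1=0.2, s0=0.1, s1=1.0, seed=0):
    """Draw (x, y) from y = exp(b0 + b1*x) * eps with E[eps | x] = 1.

    All parameter values are illustrative assumptions, not from the article.
    """
    rng = np.random.default_rng(seed)
    x = rng.integers(0, 2, size=n)  # binary regressor (0 or 1)
    var = s0 + s1 * x  # variance of ln(eps); heteroskedastic when s1 != 0
    # Normal logged error centered at -var/2, so that E[eps | x] = 1.
    log_eps = rng.normal(-var / 2.0, np.sqrt(var))
    y = np.exp(b0 + b1 * x + log_eps)
    return x, y


def log_relative_arithmetic_mean(x, y):
    """Log ratio of arithmetic means of y (target of the unlogged form)."""
    return np.log(y[x == 1].mean() / y[x == 0].mean())


def log_relative_geometric_mean(x, y):
    """Difference in means of ln(y); equals the OLS slope of ln(y) on binary x."""
    log_y = np.log(y)
    return log_y[x == 1].mean() - log_y[x == 0].mean()


if __name__ == "__main__":
    x, y = simulate()
    print("log relative arithmetic mean:", log_relative_arithmetic_mean(x, y))
    print("log relative geometric mean: ", log_relative_geometric_mean(x, y))
```

Setting s1=0.0 in this sketch makes the logged error homoskedastic and hence statistically independent of x, in which case the two quantities should agree up to sampling noise, consistent with the second result in the abstract.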
Subject
Sociology and Political Science
Cited by
11 articles.