1. Statistical Foundations of Econometric Modelling
2. Since regression specifications may be used for all sorts of purposes, including description, inference, and prediction, researchers will necessarily have need of many different criteria for choosing among them. A specification chosen to predict best in repeated samples with fixed independent variables will not necessarily be the best choice when the goal is prediction outside the sample or estimation of a particular regression coefficient. R² is nearly useless as a criterion for such purposes. Hence the proliferation of “measures of fit” such as Cp, Sp, Akaike's information criterion, etc. (e.g., Maddala 1988, 425–34). Each has its uses. However, none is a parameter estimate, and thus none replaces the SEE.
3. Lewis-Beck and Skalaban note that the R² has another use as well: it may be employed to test the hypothesis that all regression coefficients (other than the intercept) are zero. However, a little arithmetic will show that this use of the R² is just a computational trick. The only statistic actually needed for the calculation is the SEE, and, in fact, the proof that the test statistic has an F-distribution under the null hypothesis depends only on properties of the SEE. Moreover, and contrary to Lewis-Beck and Skalaban's suggestion, normality of the variables is not required for the test, but rather just normality of the disturbances.
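The arithmetic behind note 3 can be sketched as follows. This is one reading of the claim, not the author's own calculation: it assumes that "the SEE" covers both the standard error of estimate of the fitted regression and that of the intercept-only model (the sample standard deviation of y). The data and all variable names are hypothetical, chosen only for illustration.

```python
# Hypothetical data, not taken from the source.
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y = [2.1, 2.9, 4.2, 4.8, 6.1, 6.9]

n = len(y)
k = 1  # number of slope coefficients (intercept excluded)

x_bar = sum(x) / n
y_bar = sum(y) / n

# Ordinary least squares with a single regressor.
sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
sxx = sum((xi - x_bar) ** 2 for xi in x)
slope = sxy / sxx
intercept = y_bar - slope * x_bar

# Residual sums of squares: fitted model and intercept-only model.
rss = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
tss = sum((yi - y_bar) ** 2 for yi in y)

# Squared SEEs (residual variance estimates) of the two models.
see2_full = rss / (n - k - 1)
see2_null = tss / (n - 1)

# F statistic via the R-squared shortcut.
r2 = 1 - rss / tss
f_from_r2 = (r2 / k) / ((1 - r2) / (n - k - 1))

# The same F statistic written with the two SEEs only.
f_from_see = (((n - 1) * see2_null - (n - k - 1) * see2_full) / k) / see2_full

print(f_from_r2, f_from_see)  # the two values agree up to rounding error
```

Under this reading, R² enters the test only as a rescaling of the two residual sums of squares, which is the sense in which it serves as a computational shortcut.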
4. By the usual result that the total variance is the mean of the within-group variances plus the variance of the group means. Both the old within-group variances were 4; hence the mean of these two variances is also 4. Furthermore, the new total mean (= 7) is 4 units distant from each of the old group means (3 and 11), so that the group means have variance 4² = 16. Hence the total sample variance is var(x_i) = 4 + 16 = 20.
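The decomposition in note 4 can be checked numerically. The sketch below assumes two equal-sized groups and variances computed with divisor n (population variances), which is what the note's figures imply. The data points themselves are hypothetical, picked only so that the group means are 3 and 11 and each within-group variance is 4.

```python
from statistics import mean, pvariance

# Hypothetical groups matching the note's figures (not from the source).
group_a = [1, 5]    # mean 3, variance 4
group_b = [9, 13]   # mean 11, variance 4

# Mean of the within-group variances.
within = mean([pvariance(group_a), pvariance(group_b)])

# Variance of the group means around the total mean of 7.
between = pvariance([mean(group_a), mean(group_b)])

# Variance of the pooled sample, computed directly.
total = pvariance(group_a + group_b)

print(within, between, total)
```

With equal group sizes the simple mean of the within-group variances is the right weighting; unequal groups would need a size-weighted version of both terms.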