
alternative when the OLS t-statistic $t = \hat{\beta}_1 / SE(\hat{\beta}_1)$ exceeds 1.96 in absolute value.
Similarly, a 95% confidence interval for $\beta_1$, constructed as $\hat{\beta}_1 \pm 1.96\,SE(\hat{\beta}_1)$ as
described in Section 5.2, provides a 95% confidence interval for the difference
between the two population means.

Application to test scores. As an example, a regression of the test score against
the student–teacher ratio binary variable D defined in Equation (5.14) estimated
by OLS using the 420 observations in Figure 4.2 yields

$$\widehat{TestScore} = \underset{(1.3)}{650.0} + \underset{(1.8)}{7.4}\,D, \quad R^2 = 0.037, \quad SER = 18.7, \tag{5.18}$$

where the standard errors of the OLS estimates of the coefficients $\beta_0$ and $\beta_1$ are
given in parentheses below the OLS estimates. Thus the average test score for the
subsample with student–teacher ratios greater than or equal to 20 (that is, for
which D = 0) is 650.0, and the average test score for the subsample with student–
teacher ratios less than 20 (so D = 1) is 650.0 + 7.4 = 657.4. The difference
between the sample average test scores for the two groups is 7.4. This is the OLS
estimate of $\beta_1$, the coefficient on the student–teacher ratio binary variable D.
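
The link between the OLS coefficients on a binary regressor and the two group averages can be checked numerically. The following is a minimal sketch, assuming synthetic data rather than the 420 actual observations: the helper `ols_on_binary`, the random seed, and the data-generating values are all hypothetical and chosen only to loosely resemble Equation (5.18).

```python
import numpy as np

def ols_on_binary(y, d):
    """Return (intercept, slope) from an OLS regression of y on a constant and d."""
    X = np.column_stack([np.ones_like(d, dtype=float), d.astype(float)])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef[0], coef[1]

# Synthetic illustration only: these are NOT the California test score data.
rng = np.random.default_rng(0)
d = (rng.random(420) < 0.5).astype(int)              # hypothetical binary regressor
y = 650.0 + 7.4 * d + rng.normal(0.0, 18.7, 420)     # hypothetical outcome

b0_hat, b1_hat = ols_on_binary(y, d)
mean_d0 = y[d == 0].mean()   # sample average for the D = 0 group
mean_d1 = y[d == 1].mean()   # sample average for the D = 1 group

print(b0_hat, mean_d0)             # intercept vs. D = 0 group average
print(b1_hat, mean_d1 - mean_d0)   # slope vs. difference in group averages
```

In this sketch the intercept coincides with the D = 0 group average and the slope with the difference between the two group averages, up to floating-point error.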

Is the difference in the population mean test scores in the two groups statistically
significantly different from zero at the 5% level? To find out, construct the
t-statistic on $\beta_1$: $t = 7.4/1.8 = 4.04$. This value exceeds 1.96 in absolute value, so
the hypothesis that the population mean test scores in districts with high and low
student–teacher ratios are the same can be rejected at the 5% significance level.
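
The test above can be reproduced in a few lines. This is a sketch using only the rounded figures printed in Equation (5.18); the variable names are illustrative. With those rounded inputs the ratio comes out near 4.11 rather than the reported 4.04, a gap that presumably reflects the text working from unrounded estimates (an assumption, since the unrounded values are not shown). The p-value uses the standard normal approximation.

```python
import math

beta1_hat = 7.4   # rounded coefficient estimate from Equation (5.18)
se_beta1 = 1.8    # rounded standard error from Equation (5.18)

t_stat = beta1_hat / se_beta1

# Two-sided p-value under the standard normal approximation:
# 2 * (1 - Phi(|t|)) equals erfc(|t| / sqrt(2)).
p_value = math.erfc(abs(t_stat) / math.sqrt(2.0))

reject_at_5pct = abs(t_stat) > 1.96

print(f"t = {t_stat:.2f}, reject at 5%: {reject_at_5pct}")
```

Either value of the t-statistic is well above 1.96, so the conclusion does not depend on the rounding.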

The OLS estimator and its standard error can be used to construct a 95% confidence
interval for the true difference in means. This is $7.4 \pm 1.96 \times 1.8 = (3.9, 10.9)$.
This confidence interval excludes $\beta_1 = 0$, so that (as we know
from the previous paragraph) the hypothesis $\beta_1 = 0$ can be rejected at the 5%
significance level.
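
The interval arithmetic can be verified directly. The function below is a small illustrative helper, not part of any library; it takes the rounded estimate and standard error from Equation (5.18) and the 1.96 critical value used in the text.

```python
def confidence_interval_95(estimate, std_error, critical_value=1.96):
    """Return (lower, upper) for estimate +/- critical_value * std_error."""
    half_width = critical_value * std_error
    return estimate - half_width, estimate + half_width

lower, upper = confidence_interval_95(7.4, 1.8)
excludes_zero = not (lower <= 0.0 <= upper)

print(f"({lower:.1f}, {upper:.1f})")   # (3.9, 10.9)
print(excludes_zero)                   # True
```

Checking whether zero lies inside the interval is equivalent to the two-sided test at the 5% level, which is why both approaches lead to the same rejection.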

5.4 Heteroskedasticity and Homoskedasticity

Our only assumption about the distribution of $u_i$ conditional on $X_i$ is that it has a
mean of zero (the first least squares assumption). If, furthermore, the variance of
this conditional distribution does not depend on $X_i$, then the errors are said to be
homoskedastic. This section discusses homoskedasticity, its theoretical implications,
the simplified formulas for the standard errors of the OLS estimators that
arise if the errors are homoskedastic, and the risks you run if you use these simplified
formulas in practice.
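
The verbal definition above can be written compactly. In the sketch below, $\sigma_u^2$ is notation assumed here for the common variance; it is not introduced in the paragraph itself. The errors are homoskedastic if

$$\operatorname{var}(u_i \mid X_i = x) = \sigma_u^2 \quad \text{for all } x,$$

that is, if the conditional variance is the same constant at every value of the regressor. If the conditional variance instead changes with $x$, the errors are heteroskedastic.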
