Chapter 5 Regression with a Single Regressor: Hypothesis Tests and Confidence Intervals

This is the same as the regression model with the continuous regressor Xi
except that now the regressor is the binary variable Di. Because Di is not continuous,
it is not useful to think of b1 as a slope; indeed, because Di can take on only
two values, there is no “line,” so it makes no sense to talk about a slope. Thus we
will not refer to b1 as the slope in Equation (5.15); instead we will simply refer to b1 as
the coefficient multiplying Di in this regression or, more compactly, the coefficient
on Di.

If b1 in Equation (5.15) is not a slope, what is it? The best way to interpret b0
and b1 in a regression with a binary regressor is to consider, one at a time, the two
possible cases, Di = 0 and Di = 1. If the student–teacher ratio is high, then
Di = 0 and Equation (5.15) becomes

Yi = b0 + ui (Di = 0). (5.16)

Because E(ui | Di) = 0, the conditional expectation of Yi when Di = 0 is
E(Yi | Di = 0) = b0; that is, b0 is the population mean value of test scores when
the student–teacher ratio is high. Similarly, when Di = 1,

Yi = b0 + b1 + ui (Di = 1). (5.17)

Thus, when Di = 1, E(Yi | Di = 1) = b0 + b1; that is, b0 + b1 is the population
mean value of test scores when the student–teacher ratio is low.

Because b0 + b1 is the population mean of Yi when Di = 1 and b0 is the
population mean of Yi when Di = 0, the difference (b0 + b1) - b0 = b1 is the
difference between these two means. In other words, b1 is the difference between
the conditional expectation of Yi when Di = 1 and when Di = 0, or
b1 = E(Yi | Di = 1) - E(Yi | Di = 0). In the test score example, b1 is the difference
between the mean test score in districts with low student–teacher ratios and the
mean test score in districts with high student–teacher ratios.
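
As a rough numerical illustration of Equations (5.16) and (5.17), the following Python sketch simulates data from the binary-regressor model and compares the two group averages with b0 and b0 + b1. Everything specific here is an assumption made for illustration only: the values b0 = 650 and b1 = 7, the normal error with standard deviation 10, the equal chance of Di = 0 or Di = 1, and the helper names simulate and group_means. The error is drawn independently of Di, which is one simple way to satisfy E(ui | Di) = 0.

```python
import random
from statistics import fmean


def simulate(n, b0, b1, seed=0):
    """Draw (D_i, Y_i) pairs from Y_i = b0 + b1*D_i + u_i."""
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        d = rng.randint(0, 1)      # binary regressor: 0 or 1
        u = rng.gauss(0.0, 10.0)   # error drawn independently of d
        data.append((d, b0 + b1 * d + u))
    return data


def group_means(data):
    """Return (average of Y when D = 0, average of Y when D = 1)."""
    y0 = [y for d, y in data if d == 0]
    y1 = [y for d, y in data if d == 1]
    return fmean(y0), fmean(y1)


B0, B1 = 650.0, 7.0  # hypothetical population coefficients
data = simulate(100_000, B0, B1)
mean0, mean1 = group_means(data)

print("average of Y when D = 0:", mean0)       # should be close to b0
print("average of Y when D = 1:", mean1)       # should be close to b0 + b1
print("difference in averages:", mean1 - mean0)  # should be close to b1
```

With a large simulated sample, the average for the Di = 0 group should sit near b0 and the gap between the two group averages should sit near b1, which is the interpretation given in the text.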

Because b1 is the difference in the population means, it makes sense that the
OLS estimator of b1 is the difference between the sample averages of Yi in the two
groups, and, in fact, this is the case.
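
The claim can be checked directly on a small sample. The sketch below applies the usual single-regressor OLS formulas (sample covariance of the regressor and Yi divided by the sample variance of the regressor, with the intercept recovered from the sample means) to a binary regressor and compares the result with the difference in group averages. The ten data points are made up for illustration and are not taken from the test score data set; the function name ols_single_regressor is likewise hypothetical.

```python
from statistics import fmean


def ols_single_regressor(x, y):
    """OLS intercept and coefficient for a regression of y on a single regressor x."""
    x_bar, y_bar = fmean(x), fmean(y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    b1_hat = sxy / sxx
    b0_hat = y_bar - b1_hat * x_bar
    return b0_hat, b1_hat


# Hypothetical sample: d = 1 for a low student-teacher ratio, 0 for a high one.
d = [0, 0, 0, 1, 1, 0, 1, 1, 0, 1]
y = [640.0, 655.5, 648.2, 661.0, 657.3, 651.9, 664.8, 659.1, 645.0, 662.4]

b0_hat, b1_hat = ols_single_regressor(d, y)
avg0 = fmean([yi for di, yi in zip(d, y) if di == 0])
avg1 = fmean([yi for di, yi in zip(d, y) if di == 1])

print("OLS coefficient on D:", b1_hat)
print("difference in sample averages:", avg1 - avg0)
print("OLS intercept:", b0_hat)
print("sample average when D = 0:", avg0)
```

Up to floating-point rounding, the estimated coefficient on Di matches the difference between the two sample averages, and the estimated intercept matches the sample average of the Di = 0 group.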

Hypothesis tests and confidence intervals. If the two population means are the
same, then b1 in Equation (5.15) is zero. Thus the null hypothesis that the two
population means are the same can be tested against the alternative hypothesis
that they differ by testing the null hypothesis b1 = 0 against the alternative
b1 ≠ 0. This hypothesis can be tested using the procedure outlined in Section 5.1.
Specifically, the null hypothesis can be rejected at the 5% level against the two-sided
