
202	 Chapter 5  Regression with a Single Regressor: Hypothesis Tests and Confidence Intervals

                              This is the same as the regression model with the continuous regressor Xi
                         except that now the regressor is the binary variable Di. Because Di is not continu-
                         ous, it is not useful to think of b1 as a slope; indeed, because Di can take on only
                         two values, there is no “line,” so it makes no sense to talk about a slope. Thus we
                         will not refer to b1 as the slope in Equation (5.15); instead we will simply refer to b1 as
                         the coefficient multiplying Di in this regression or, more compactly, the coefficient
                         on Di.

                              If b1 in Equation (5.15) is not a slope, what is it? The best way to interpret b0
                         and b1 in a regression with a binary regressor is to consider, one at a time, the two
                         possible cases, Di = 0 and Di = 1. If the student–teacher ratio is high, then
                         Di = 0 and Equation (5.15) becomes

                         	 Yi = b0 + ui (Di = 0).	(5.16)

                         Because E(ui | Di) = 0, the conditional expectation of Yi when Di = 0 is
                         E(Yi | Di = 0) = b0; that is, b0 is the population mean value of test scores when
                         the student–teacher ratio is high. Similarly, when Di = 1,

                         	 Yi = b0 + b1 + ui (Di = 1).	(5.17)

                         Thus, when Di = 1, E(Yi | Di = 1) = b0 + b1; that is, b0 + b1 is the population
                         mean value of test scores when the student–teacher ratio is low.

                              Because b0 + b1 is the population mean of Yi when Di = 1 and b0 is the
                         population mean of Yi when Di = 0, the difference (b0 + b1) - b0 = b1 is the
                         difference between these two means. In other words, b1 is the difference between
                         the conditional expectation of Yi when Di = 1 and when Di = 0, or
                         b1 = E(Yi | Di = 1) - E(Yi | Di = 0). In the test score example, b1 is the difference
                         between mean test score in districts with low student–teacher ratios and the
                         mean test score in districts with high student–teacher ratios.

                              Because b1 is the difference in the population means, it makes sense that the
                         OLS estimator of b1 is the difference between the sample averages of Yi in the two
                         groups, and, in fact, this is the case.
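
                         The equivalence between the OLS coefficient on Di and the difference in
                         group sample averages can be checked numerically. The sketch below is
                         illustrative only: the data are simulated, and the function name
                         ols_binary, the sample size, and the values 650, 7, and 19 are
                         assumptions made for this example, not figures from the text.

```python
import numpy as np

def ols_binary(y, d):
    """Regress y on a constant and a binary regressor d.

    Returns (b0_hat, b1_hat): the estimated intercept and the
    estimated coefficient on d.
    """
    y = np.asarray(y, dtype=float)
    d = np.asarray(d, dtype=float)
    X = np.column_stack([np.ones_like(d), d])  # columns: constant, D_i
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef[0], coef[1]

# Simulated data (hypothetical values, chosen only for illustration).
rng = np.random.default_rng(0)
d = rng.integers(0, 2, size=200)                      # D_i is 0 or 1
y = 650.0 + 7.0 * d + rng.normal(0.0, 19.0, size=200)

b0_hat, b1_hat = ols_binary(y, d)

# Sample averages of Y_i within each group.
mean0 = y[d == 0].mean()
mean1 = y[d == 1].mean()
diff_in_means = mean1 - mean0

print("intercept estimate vs. mean of Y when D = 0:", b0_hat, mean0)
print("coefficient on D vs. difference in means:   ", b1_hat, diff_in_means)
```

                         Up to floating-point error, the two numbers on each printed line should
                         agree: the intercept estimate matches the sample average of Yi in the
                         Di = 0 group, and the coefficient on Di matches the difference between
                         the two group averages.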

                        Hypothesis tests and confidence intervals.  If the two population means are the
                         same, then b1 in Equation (5.15) is zero. Thus the null hypothesis that the two
                         population means are the same can be tested against the alternative hypothesis
                         that they differ by testing the null hypothesis b1 = 0 against the alternative
                         b1 ≠ 0. This hypothesis can be tested using the procedure outlined in Section 5.1.
                         Specifically, the null hypothesis can be rejected at the 5% level against the two-sided