

interval is $\hat{\beta}_1 + 1.96\,SE(\hat{\beta}_1)$, and the predicted effect of the change using that estimate is $[\hat{\beta}_1 + 1.96\,SE(\hat{\beta}_1)] \times \Delta x$. Thus a 95% confidence interval for the effect of changing x by the amount $\Delta x$ can be expressed as

95% confidence interval for $\beta_1 \Delta x$
$= [(\hat{\beta}_1 - 1.96\,SE(\hat{\beta}_1))\Delta x,\ (\hat{\beta}_1 + 1.96\,SE(\hat{\beta}_1))\Delta x]$. (5.13)

For example, our hypothetical superintendent is contemplating reducing the student–teacher ratio by 2. Because the 95% confidence interval for $\beta_1$ is $[-3.30, -1.26]$, the effect of reducing the student–teacher ratio by 2 could be as great as $-3.30 \times (-2) = 6.60$ or as little as $-1.26 \times (-2) = 2.52$. Thus decreasing the student–teacher ratio by 2 is predicted to increase test scores by between 2.52 and 6.60 points, with a 95% confidence level.
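
The arithmetic in the superintendent's example can be checked with a short Python sketch. The function name `effect_interval` and its parameters are hypothetical, chosen only for illustration; the inputs are the interval endpoints and the change in x quoted in the text.

```python
def effect_interval(ci_low, ci_high, delta_x):
    """Scale a confidence interval for the slope by delta_x, as in Equation (5.13)."""
    endpoints = (ci_low * delta_x, ci_high * delta_x)
    # A negative delta_x reverses the order of the endpoints, so sort them.
    return min(endpoints), max(endpoints)


# 95% confidence interval for the slope from the text, and a reduction of 2.
low, high = effect_interval(-3.30, -1.26, -2)
print(f"{low:.2f} to {high:.2f}")  # prints: 2.52 to 6.60
```

Sorting the endpoints matters here because the change in x is negative, so the lower endpoint of the slope interval produces the upper endpoint of the effect interval.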

5.3 Regression When X Is a Binary Variable

                         The discussion so far has focused on the case that the regressor is a continuous
                         variable. Regression analysis can also be used when the regressor is binary—that
                         is, when it takes on only two values, 0 or 1. For example, X might be a worker’s
                         gender (= 1 if female, = 0 if male), whether a school district is urban or rural
                         (= 1 if urban, = 0 if rural), or whether the district’s class size is small or large
                         (= 1 if small, = 0 if large). A binary variable is also called an indicator variable
                         or sometimes a dummy variable.

Interpretation of the Regression Coefficients

The mechanics of regression with a binary regressor are the same as if it is continuous. The interpretation of $\beta_1$, however, is different, and it turns out that regression with a binary variable is equivalent to performing a difference of means analysis, as described in Section 3.4.

     To see this, suppose you have a variable $D_i$ that equals either 0 or 1, depending on whether the student–teacher ratio is less than 20:

$$D_i = \begin{cases} 1 & \text{if the student–teacher ratio in } i\text{th district} < 20 \\ 0 & \text{if the student–teacher ratio in } i\text{th district} \geq 20 \end{cases} \qquad (5.14)$$

The population regression model with $D_i$ as the regressor is

$$Y_i = \beta_0 + \beta_1 D_i + u_i, \quad i = 1, \ldots, n. \qquad (5.15)$$
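
The equivalence between this regression and a difference of means can be illustrated numerically. The following is a minimal sketch, not an analysis of the test score data: the districts are synthetic, the numbers used to generate them are arbitrary assumptions, and the helper name `ols` is hypothetical. It builds the indicator using the threshold of 20, fits the regression by ordinary least squares, and compares the estimates with the two group means.

```python
import random

random.seed(0)  # fixed seed so the sketch is reproducible

# Synthetic (made-up) districts: a student-teacher ratio and a test score.
ratios = [random.uniform(14, 26) for _ in range(200)]
scores = [700 - 2.0 * r + random.gauss(0, 10) for r in ratios]

# Indicator: 1 if the student-teacher ratio is below 20, else 0.
d = [1 if r < 20 else 0 for r in ratios]


def ols(x, y):
    """Return the OLS (intercept, slope) for a regression of y on a single x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope


b0_hat, b1_hat = ols(d, scores)

# Sample means of the test score in each group.
group1 = [s for s, di in zip(scores, d) if di == 1]
group0 = [s for s, di in zip(scores, d) if di == 0]
mean1 = sum(group1) / len(group1)
mean0 = sum(group0) / len(group0)

print("OLS intercept:      ", b0_hat)
print("Mean when D = 0:    ", mean0)
print("OLS slope:          ", b1_hat)
print("Difference in means:", mean1 - mean0)
```

Up to floating-point rounding, the estimated intercept matches the sample mean of the group with the indicator equal to 0, and the estimated slope matches the difference between the two group means.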