
250	 Chapter 6  Linear Regression with Multiple Regressors

                         As in the previous example, the perfect linear relationship among the regressors
                         involves the constant regressor X0i: For every district, PctESi = 100 - PctELi =
                         100 × X0i - PctELi, because X0i = 1 for all i.

                              This example illustrates another point: Perfect multicollinearity is a feature of the
                         entire set of regressors. If either the intercept (that is, the regressor X0i) or PctELi were
                         excluded from this regression, the regressors would not be perfectly multicollinear.
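This dependence can be checked numerically. The sketch below (with made-up PctEL values) builds the design matrix with the constant, PctELi, and PctESi, and shows that it is rank-deficient, while dropping either PctESi or the constant would remove the exact linear relationship:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 8
pct_el = rng.uniform(0, 100, n)   # hypothetical percent English learners
pct_es = 100 - pct_el             # percent English speakers, by construction

const = np.ones(n)                # the constant regressor X0i
X_full = np.column_stack([const, pct_el, pct_es])

# The third column equals 100*const - pct_el exactly, so the three
# columns span only a two-dimensional space: perfect multicollinearity.
print(np.linalg.matrix_rank(X_full))      # 2, not 3

# Excluding PctES (or, alternatively, the constant) restores full rank.
X_drop_es = np.column_stack([const, pct_el])
print(np.linalg.matrix_rank(X_drop_es))   # 2 = number of columns
```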

                        The dummy variable trap.  Another possible source of perfect multicollinearity arises
                         when multiple binary, or dummy, variables are used as regressors. For example, sup-
                         pose you have partitioned the school districts into three categories: rural, suburban,
                         and urban. Each district falls into one (and only one) category. Let these binary vari-
                         ables be Rurali, which equals 1 for a rural district and equals 0 otherwise; Suburbani;
                         and Urbani. If you include all three binary variables in the regression along with a
                         constant, the regressors will be perfectly multicollinear: Because each district belongs
                         to one and only one category, Rurali + Suburbani + Urbani = 1 = X0i, where X0i
                         denotes the constant regressor introduced in Equation (6.6). Thus, to estimate the
                         regression, you must exclude one of these four variables, either one of the binary
                         indicators or the constant term. By convention, the constant term is retained, in which
                         case one of the binary indicators is excluded. For example, if Rurali were excluded,
                         then the coefficient on Suburbani would be the average difference between test scores
                         in suburban and rural districts, holding constant the other variables in the regression.
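The trap can be seen directly in the design matrix. In this sketch (with hypothetical district categories), the three dummies sum to the constant column, so including all four regressors produces a rank-deficient matrix, while dropping one dummy does not:

```python
import numpy as np

# Hypothetical category labels for nine districts
labels = np.array(["rural", "rural", "rural",
                   "suburban", "suburban", "suburban",
                   "urban", "urban", "urban"])

rural = (labels == "rural").astype(float)
suburban = (labels == "suburban").astype(float)
urban = (labels == "urban").astype(float)
const = np.ones(len(labels))

# rural + suburban + urban = const in every row, so the four columns
# have rank 3, not 4: the dummy variable trap.
X_trap = np.column_stack([const, rural, suburban, urban])
print(np.linalg.matrix_rank(X_trap))   # 3

# Excluding one dummy (here rural, the base case) gives full column rank.
X_ok = np.column_stack([const, suburban, urban])
print(np.linalg.matrix_rank(X_ok))     # 3 = number of columns
```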

                              In general, if there are G binary variables, if each observation falls into one
                         and only one category, if there is an intercept in the regression, and if all G binary
                         variables are included as regressors, then the regression will fail because of perfect
                         multicollinearity. This situation is called the dummy variable trap. The usual way
                         to avoid the dummy variable trap is to exclude one of the binary variables from
                         the multiple regression, so only G - 1 of the G binary variables are included as
                         regressors. In this case, the coefficients on the included binary variables represent
                         the incremental effect of being in that category, relative to the base case of the
                         omitted category, holding constant the other regressors. Alternatively, all G
                         binary regressors can be included if the intercept is omitted from the regression.
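Both escapes from the trap can be illustrated with a small OLS fit on made-up test scores (the numbers below are illustrative only). Omitting the intercept and including all G dummies yields category means as coefficients; keeping the intercept and dropping the base-case dummy yields the base-case mean plus incremental effects:

```python
import numpy as np

# Hypothetical test scores for three districts in each category
labels = np.array(["rural"] * 3 + ["suburban"] * 3 + ["urban"] * 3)
scores = np.array([640., 650., 660., 655., 665., 675., 630., 640., 650.])

rural = (labels == "rural").astype(float)
suburban = (labels == "suburban").astype(float)
urban = (labels == "urban").astype(float)
const = np.ones(len(labels))

# Option 1: omit the intercept and include all G dummies.
# Each coefficient is the mean score in its category.
X_no_const = np.column_stack([rural, suburban, urban])
b1, *_ = np.linalg.lstsq(X_no_const, scores, rcond=None)
print(b1)   # [650. 665. 640.]

# Option 2: keep the intercept and drop one dummy (rural = base case).
# The intercept is the rural mean; the other coefficients are each
# category's mean minus the rural mean.
X_base = np.column_stack([const, suburban, urban])
b2, *_ = np.linalg.lstsq(X_base, scores, rcond=None)
print(b2)   # [650.  15. -10.]
```

The two parameterizations fit the data identically; they merely relabel the same category means.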

                        Solutions to perfect multicollinearity.  Perfect multicollinearity typically arises
                         when a mistake has been made in specifying the regression. Sometimes the mis-
                         take is easy to spot (as in the first example) but sometimes it is not (as in the
                         second example). In one way or another, your software will let you know if you
                         make such a mistake because it cannot compute the OLS estimator if you have.

                              When your software lets you know that you have perfect multicollinearity, it
                         is important that you modify your regression to eliminate it. Some software is