	 12.1	 The IV Estimator with a Single Regressor and a Single Instrument

                         We start with the case of a single regressor, X, which might be correlated with the
                         regression error, u. If X and u are correlated, the OLS estimator is inconsistent;
                         that is, it may not be close to the true value of the regression coefficient even when
                         the sample is very large [see Equation (6.1)]. As discussed in Section 9.2, this
                         correlation between X and u can stem from various sources, including omitted
                         variables, errors in variables (measurement errors in the regressors), and simultaneous
                         causality (when causality runs “backward” from Y to X as well as “forward” from
                         X to Y). Whatever the source of the correlation between X and u, if there is a
                         valid instrumental variable, Z, the effect on Y of a unit change in X can be
                         estimated using the instrumental variables estimator.
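The inconsistency of OLS can be illustrated with a small simulation. The sketch below is not from the text: the data-generating process, the coefficient values, and the function names `ols_slope` and `simulate_ols` are all assumptions chosen for illustration. X is built to share a component with u, so that cov(X, u) = 0.8 and var(X) = 1.64. Under this design the OLS slope should settle near 2 + 0.8/1.64 (about 2.49) rather than the true value of 2, no matter how large the sample becomes.

```python
import random

def ols_slope(x, y):
    """OLS slope: sample covariance of (x, y) divided by sample variance of x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx

def simulate_ols(n, beta1=2.0, seed=0):
    """Simulate Y = 1 + beta1*X + u with X correlated with u (assumed design)."""
    rng = random.Random(seed)
    x, y = [], []
    for _ in range(n):
        u = rng.gauss(0, 1)
        x_i = rng.gauss(0, 1) + 0.8 * u  # X shares a component with u by construction
        x.append(x_i)
        y.append(1.0 + beta1 * x_i + u)
    return ols_slope(x, y)

# The estimate stabilizes as n grows, but not at the true slope of 2.
for n in (100, 10_000, 100_000):
    print(n, round(simulate_ols(n), 3))
```

Increasing n shrinks the sampling noise in the estimate but leaves the gap between the estimate and the true coefficient in place, which is what inconsistency means here.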

                   The IV Model and Assumptions

                         The population regression model relating the dependent variable $Y_i$ and regressor
                         $X_i$ is

                         $$Y_i = \beta_0 + \beta_1 X_i + u_i, \quad i = 1, \dots, n, \tag{12.1}$$

                         where, as usual, $u_i$ is the error term representing omitted factors that determine $Y_i$.
                         If $X_i$ and $u_i$ are correlated, the OLS estimator is inconsistent. Instrumental
                         variables estimation uses an additional, “instrumental” variable Z to isolate that part
                         of X that is uncorrelated with $u_i$.
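One way to picture "isolating the part of X that is uncorrelated with the error" is the two-step sketch below. It is an illustration under stated assumptions, not the text's own procedure: the simulated data, the coefficient values, and the helper names (`simulate`, `ols_slope`, `iv_slope`) are hypothetical. The instrument Z is assumed to be correlated with X and drawn independently of u. The first step regresses X on Z and keeps the fitted values; the second step regresses Y on those fitted values.

```python
import random

def mean(v):
    return sum(v) / len(v)

def cov(a, b):
    """Sample covariance with divisor n - 1."""
    ma, mb = mean(a), mean(b)
    return sum((p - ma) * (q - mb) for p, q in zip(a, b)) / (len(a) - 1)

def ols_slope(x, y):
    """OLS slope of y on x."""
    return cov(x, y) / cov(x, x)

def simulate(n, beta0=1.0, beta1=2.0, seed=0):
    """Assumed design: Z is independent of u, and X depends on both Z and u."""
    rng = random.Random(seed)
    z, x, y = [], [], []
    for _ in range(n):
        u = rng.gauss(0, 1)
        z_i = rng.gauss(0, 1)                  # instrument, drawn independently of u
        x_i = z_i + 0.8 * u + rng.gauss(0, 1)  # regressor, correlated with u
        z.append(z_i)
        x.append(x_i)
        y.append(beta0 + beta1 * x_i + u)
    return z, x, y

def iv_slope(z, x, y):
    """Two-step IV sketch for one regressor and one instrument."""
    # Step 1: regress X on Z and keep the fitted values (the part of X explained by Z).
    pi1 = ols_slope(z, x)
    pi0 = mean(x) - pi1 * mean(z)
    x_hat = [pi0 + pi1 * z_i for z_i in z]
    # Step 2: regress Y on the fitted values.
    return ols_slope(x_hat, y)

z, x, y = simulate(100_000)
print("OLS slope:", round(ols_slope(x, y), 3))
print("IV slope: ", round(iv_slope(z, x, y), 3))
```

In this design the IV slope should land close to the true value of 2, while the OLS slope stays above it. With a single regressor and a single instrument, the two-step result is algebraically the same as the ratio cov(Z, Y) / cov(Z, X), which can serve as a quick check on the implementation.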

                        Endogeneity and exogeneity.  Instrumental variables regression has some
                         specialized terminology to distinguish variables that are correlated with the population
                         error term u from ones that are not. Variables correlated with the error term are
                         called endogenous variables, while variables uncorrelated with the error term are
                         called exogenous variables. The historical source of these terms traces to models
                         with multiple equations, in which an “endogenous” variable is determined within
                         the model while an “exogenous” variable is determined outside the model. For
                         example, Section 9.2 considered the possibility that if low test scores produced
                         decreases in the student–teacher ratio because of political intervention and
                         increased funding, causality would run both from the student–teacher ratio to test
                         scores and from test scores to the student–teacher ratio. This was represented
                         mathematically as a system of two simultaneous equations [Equations (9.3) and (9.4)],