12.1 The IV Estimator with a Single Regressor and a Single Instrument

We start with the case of a single regressor, X, which might be correlated with the
regression error, u. If X and u are correlated, the OLS estimator is inconsistent;
that is, it may not be close to the true value of the regression coefficient even when
the sample is very large [see Equation (6.1)]. As discussed in Section 9.2, this
correlation between X and u can stem from various sources, including omitted
variables, errors in variables (measurement errors in the regressors), and
simultaneous causality (when causality runs “backward” from Y to X as well as
“forward” from X to Y). Whatever the source of the correlation between X and u, if
there is a valid instrumental variable, Z, the effect on Y of a unit change in X can
be estimated using the instrumental variables estimator.
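
To see what inconsistency means in practice, the following sketch simulates data in
which X is built to be correlated with u and computes the OLS slope for growing
sample sizes. The data-generating process, the coefficient values (beta0 = 1,
beta1 = 2), and the function names are assumptions chosen for illustration; they do
not come from the text. In this design the population covariance of X and u is 0.5
and the variance of X is 1.25, so the OLS slope should settle near
2 + 0.5/1.25 = 2.4 rather than the true value 2, however large n becomes.

```python
import random


def ols_slope(x, y):
    """OLS slope: sample covariance of (x, y) divided by sample variance of x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    s_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    s_xx = sum((a - mean_x) ** 2 for a in x)
    return s_xy / s_xx


def simulate_ols(n, beta0=1.0, beta1=2.0, seed=0):
    """Draw n observations with X correlated with u; return the OLS slope.

    Hypothetical design: X = e + 0.5*u, with e and u independent N(0, 1).
    """
    rng = random.Random(seed)
    xs, ys = [], []
    for _ in range(n):
        u = rng.gauss(0, 1)          # regression error
        e = rng.gauss(0, 1)          # part of X unrelated to u
        x = e + 0.5 * u              # X is correlated with u by construction
        xs.append(x)
        ys.append(beta0 + beta1 * x + u)
    return ols_slope(xs, ys)


if __name__ == "__main__":
    for n in (100, 10_000, 100_000):
        print(n, round(simulate_ols(n), 3))
```

Increasing n narrows the sampling variation of the estimate but does not move it
toward the true coefficient, which is the sense in which OLS is inconsistent here.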

The IV Model and Assumptions

The population regression model relating the dependent variable $Y_i$ and regressor
$X_i$ is

$Y_i = \beta_0 + \beta_1 X_i + u_i, \quad i = 1, \ldots, n,$ (12.1)

where as usual $u_i$ is the error term representing omitted factors that determine
$Y_i$. If $X_i$ and $u_i$ are correlated, the OLS estimator is inconsistent.
Instrumental variables estimation uses an additional, “instrumental” variable Z to
isolate that part of X that is uncorrelated with $u_i$.
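
As a rough illustration of how an instrument can help, the sketch below simulates
Equation (12.1) with an added variable Z and compares the OLS slope with the ratio
of the sample covariance of Z and Y to the sample covariance of Z and X. That ratio
is the standard single-instrument IV estimator; it is not derived in this passage.
We assume here that a "valid" instrument is one that is correlated with X and
uncorrelated with u, and Z is constructed accordingly. The data-generating process,
coefficient values, and function names are hypothetical.

```python
import random


def sample_cov(a, b):
    """Sample covariance of two equal-length lists."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    return sum((p - mean_a) * (q - mean_b) for p, q in zip(a, b)) / (n - 1)


def simulate_iv(n, beta0=1.0, beta1=2.0, seed=0):
    """Return (ols_slope, iv_slope) for one simulated sample of size n.

    Hypothetical design: X = Z + 0.5*u + e, with Z, u, e independent N(0, 1),
    so Z moves X but is unrelated to the error u.
    """
    rng = random.Random(seed)
    zs, xs, ys = [], [], []
    for _ in range(n):
        z = rng.gauss(0, 1)          # instrument
        u = rng.gauss(0, 1)          # regression error
        e = rng.gauss(0, 1)          # other variation in X
        x = z + 0.5 * u + e          # X is correlated with both Z and u
        zs.append(z)
        xs.append(x)
        ys.append(beta0 + beta1 * x + u)
    ols = sample_cov(xs, ys) / sample_cov(xs, xs)
    iv = sample_cov(zs, ys) / sample_cov(zs, xs)
    return ols, iv


if __name__ == "__main__":
    ols, iv = simulate_iv(100_000)
    print("OLS slope:", round(ols, 3))
    print("IV slope: ", round(iv, 3))
```

In this design the OLS slope should drift toward 2 + 0.5/2.25, roughly 2.22, while
the IV ratio should be close to the true coefficient of 2 in large samples, because
it uses only the variation in X that comes from Z.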

Endogeneity and exogeneity. Instrumental variables regression has some specialized
terminology to distinguish variables that are correlated with the population
error term u from ones that are not. Variables correlated with the error term are
called endogenous variables, while variables uncorrelated with the error term are
called exogenous variables. The historical source of these terms traces to models
with multiple equations, in which an “endogenous” variable is determined within
the model while an “exogenous” variable is determined outside the model. For
example, Section 9.2 considered the possibility that if low test scores produced
decreases in the student–teacher ratio because of political intervention and
increased funding, causality would run both from the student–teacher ratio to test
scores and from test scores to the student–teacher ratio. This was represented
mathematically as a system of two simultaneous equations [Equations (9.3) and (9.4)],
