Chapter 18
The Theory of Multiple Regression
This chapter provides an introduction to the theory of multiple regression analysis.
The chapter has four objectives. The first is to present the multiple regression
model in matrix form, which leads to compact formulas for the OLS estimator and
test statistics. The second objective is to characterize the sampling distribution of the
OLS estimator, both in large samples (using asymptotic theory) and in small samples
(if the errors are homoskedastic and normally distributed). The third objective is to
study the theory of efficient estimation of the coefficients of the multiple regression
model and to describe generalized least squares (GLS), a method for estimating the
regression coefficients efficiently when the errors are heteroskedastic and/or corre-
lated across observations. The fourth objective is to provide a concise treatment of
the asymptotic distribution theory of instrumental variables (IV) regression in the
linear model, including an introduction to generalized method of moments (GMM)
estimation in the linear IV regression model with heteroskedastic errors.
The chapter begins by laying out the multiple regression model and the OLS
estimator in matrix form in Section 18.1. This section also presents the extended
least squares assumptions for the multiple regression model. The first four of these
assumptions are the same as the least squares assumptions of Key Concept 6.4 and
underlie the asymptotic distributions used to justify the procedures described in
Chapters 6 and 7. The remaining two extended least squares assumptions are
stronger and permit us to explore in more detail the theoretical properties of the
OLS estimator in the multiple regression model.
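As a preview of the compact formulas that the matrix form makes possible, the following is a minimal sketch, not the chapter's own derivation. It assumes the standard matrix notation in which Y is the n × 1 vector of observations on the dependent variable and X is the n × (k + 1) matrix of regressors (including a column of ones), so that the OLS estimator solves the normal equations (X'X)b = X'Y. The function name `ols_matrix` and the simulated data are hypothetical and chosen only for illustration.

```python
import numpy as np

def ols_matrix(X, Y):
    """Return the OLS coefficients by solving the normal equations (X'X) b = X'Y."""
    # Solving the linear system avoids forming the inverse of X'X explicitly.
    return np.linalg.solve(X.T @ X, X.T @ Y)

# Simulated data (illustrative values only): an intercept and two regressors.
rng = np.random.default_rng(0)
n = 200
X = np.column_stack([np.ones(n), rng.normal(size=(n, 2))])
beta_true = np.array([1.0, 2.0, -0.5])   # assumed coefficients for the simulation
Y = X @ beta_true + rng.normal(size=n)   # homoskedastic normal errors

beta_hat = ols_matrix(X, Y)

# Cross-check against NumPy's least squares routine.
beta_lstsq = np.linalg.lstsq(X, Y, rcond=None)[0]

# OLS residuals are orthogonal to every column of X.
residuals = Y - X @ beta_hat
```

With simulated data of this kind, `beta_hat` should agree with `beta_lstsq` up to numerical precision, and `X.T @ residuals` should be numerically zero.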
The next three sections examine the sampling distribution of the OLS estimator
and test statistics. Section 18.2 presents the asymptotic distributions of the OLS
estimator and t-statistic under the least squares assumptions of Key Concept 6.4.
Section 18.3 unifies and generalizes the tests of hypotheses involving multiple coef-
ficients presented in Sections 7.2 and 7.3, and provides the asymptotic distribution of
the resulting F-statistic. In Section 18.4, we examine the exact sampling distributions
of the OLS estimator and test statistics in the special case that the errors are homo-
skedastic and normally distributed. Although the assumption of homoskedastic
normal errors is implausible in most econometric applications, the exact sampling
distributions are of theoretical interest, and p-values computed using these distri-
butions often appear in the output of regression software.

