682 Chapter 15 Estimation of Dynamic Causal Effects
implied by $\phi(L)\beta(L) = \delta(L)$. Thus the estimator of the dynamic multipliers based on the
OLS estimators of the coefficients of the ADL model, $\hat{\delta}(L)$ and $\hat{\phi}(L)$, is
$$\hat{\beta}^{ADL}(L) = \hat{\phi}(L)^{-1}\hat{\delta}(L). \qquad (15.45)$$
The expressions for the coefficients in Equation (15.29) in the text are obtained as a special
case of Equation (15.45) when r = 1 and p = 1.
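The inversion in Equation (15.45) can be carried out coefficient by coefficient, because matching powers of L in the identity "ADL lag polynomial on Y, times the distributed lag polynomial, equals the ADL lag polynomial on X" gives a simple recursion. The following is a minimal sketch, not code from the text: the function name `dynamic_multipliers` and its arguments are hypothetical, and each polynomial is passed as the list of its coefficients in increasing powers of L, so the sign convention used for the autoregressive coefficients is whatever the caller supplies.

```python
def dynamic_multipliers(phi, delta, horizon):
    """First horizon + 1 coefficients of beta(L) = phi(L)^{-1} delta(L).

    phi   : coefficients of phi(L) in powers of L; phi[0] must be nonzero
            (it equals 1 when phi(L) is normalized).
    delta : coefficients of delta(L) in powers of L.
    Both are assumed inputs, for example OLS estimates from an ADL model.
    """
    beta = []
    for j in range(horizon + 1):
        # delta_j is zero beyond the degree of delta(L)
        d_j = delta[j] if j < len(delta) else 0.0
        # Matching the coefficient on L^j in phi(L) * beta(L) = delta(L):
        #   sum_{i=0}^{min(j, p)} phi_i * beta_{j-i} = delta_j
        acc = sum(phi[i] * beta[j - i]
                  for i in range(1, min(j, len(phi) - 1) + 1))
        beta.append((d_j - acc) / phi[0])
    return beta


# Illustrative (made-up) values: phi(L) = 1 - 0.5 L, delta(L) = 2 + L
print(dynamic_multipliers([1.0, -0.5], [2.0, 1.0], 3))  # [2.0, 2.0, 1.0, 0.5]
```

In this illustration the first two multipliers come from the coefficients on X adjusted by the autoregressive term, and every later multiplier is half the previous one, so a finite ADL model implies multipliers at all lags.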
The feasible GLS estimator is computed by obtaining a preliminary estimator of $\phi(L)$,
computing estimated quasi-differences, estimating $\beta(L)$ in Equation (15.44) using these
estimated quasi-differences, and (if desired) iterating until convergence. The iterated GLS
estimator is the NLLS estimator computed by NLLS estimation of the ADL model in
Equation (15.42), subject to the nonlinear restrictions on the parameters contained in
Equation (15.43).
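The steps just described can be sketched as an iteration in the style of Cochrane-Orcutt. This is an illustration under stated assumptions rather than the text's own procedure: the errors are assumed to follow an AR(p) written as u_t = a_1 u_{t-1} + ... + a_p u_{t-p} plus an innovation (a sign convention chosen here), the names `lagmat`, `quasi_diff`, and `feasible_gls` are hypothetical, and convergence is judged by a simple tolerance on successive estimates.

```python
import numpy as np


def lagmat(x, nlags):
    """Columns x_t, x_{t-1}, ..., x_{t-nlags} for t = nlags, ..., T-1."""
    T = len(x)
    return np.column_stack([x[nlags - j: T - j] for j in range(nlags + 1)])


def quasi_diff(z, a):
    """z_t - a_1 z_{t-1} - ... - a_p z_{t-p}, applied down the rows of z."""
    p = len(a)
    out = z[p:].copy()
    for i, a_i in enumerate(a, start=1):
        out -= a_i * z[p - i: len(z) - i]
    return out


def feasible_gls(y, x, r, p, max_iter=50, tol=1e-8):
    """Iterated feasible GLS for a distributed lag model with AR(p) errors.

    Returns (coef, a): coef = [intercept, beta_0, ..., beta_r] of the
    distributed lag model, a = estimated AR coefficients of the errors.
    """
    y = np.asarray(y, dtype=float)
    x = np.asarray(x, dtype=float)
    X = lagmat(x, r)                       # current and r lagged values of x
    Y = y[r:]
    Z = np.column_stack([np.ones(len(Y)), X])
    coef = np.linalg.lstsq(Z, Y, rcond=None)[0]   # preliminary OLS estimate
    a = np.zeros(p)
    for _ in range(max_iter):
        u = Y - Z @ coef                   # residuals of the distributed lag model
        U = lagmat(u, p)                   # u_t and its p lags
        a_new = np.linalg.lstsq(U[:, 1:], U[:, 0], rcond=None)[0]
        Yq = quasi_diff(Y, a_new)          # estimated quasi-differences
        Xq = quasi_diff(X, a_new)
        # Scaling the constant by (1 - sum a) keeps coef[0] equal to the
        # intercept of the original (untransformed) model.
        const = np.full(len(Yq), 1.0 - a_new.sum())
        coef_new = np.linalg.lstsq(np.column_stack([const, Xq]), Yq,
                                   rcond=None)[0]
        change = max(np.max(np.abs(coef_new - coef)),
                     np.max(np.abs(a_new - a)))
        coef, a = coef_new, a_new
        if change < tol:                   # iterate until convergence
            break
    return coef, a
```

Stopping after the first pass through the loop gives a one-step feasible GLS estimate; letting the loop run corresponds to the "(if desired) iterating until convergence" variant.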
As stressed in the discussion surrounding Equation (15.36) in the text, it is not enough
for $X_t$ to be (past and present) exogenous to use either of these estimation methods, for
exogeneity alone does not ensure that Equation (15.36) holds. If, however, X is strictly
exogenous, then Equation (15.36) does hold, and assuming that Assumptions 2 through 4
of Key Concept 14.6 hold, these estimators are consistent and asymptotically normal.
Moreover, the usual (cross-sectional heteroskedasticity-robust) OLS standard errors provide
a valid basis for statistical inference.
Parameter reduction using the ADL model. Suppose that the distributed lag polynomial
$\beta(L)$ can be written as a ratio of lag polynomials, $\theta_2(L)^{-1}\theta_1(L)$, where $\theta_1(L)$ and $\theta_2(L)$ are
both lag polynomials of a low degree. Then $\phi(L)\beta(L)$ in Equation (15.43) is
$\phi(L)\beta(L) = \phi(L)[\theta_2(L)^{-1}\theta_1(L)] = [\phi(L)\theta_2(L)^{-1}]\theta_1(L)$. If it so happens that $\phi(L) = \theta_2(L)$,
then $\delta(L) = \phi(L)\beta(L) = \theta_1(L)$. If the degree of $\theta_1(L)$ is low, then q, the number of lags of
$X_t$ in the ADL model, can be much less than r. Thus, under these assumptions, estimation
of the ADL model entails estimating potentially many fewer parameters than the original
distributed lag model. It is in this sense that the ADL model can achieve more parsimonious
parameterizations (that is, use fewer unknown parameters) than the distributed lag
model.
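A small worked case may make the parameter count concrete. It is an illustration under assumed functional forms, not an example from the text: take the numerator polynomial to be a constant, $\theta_1(L) = \delta_0$, and the denominator to be $\theta_2(L) = 1 - \phi_1 L$ with $|\phi_1| < 1$ (the minus sign is a convention chosen here). Then

$$\beta(L) = \theta_2(L)^{-1}\theta_1(L) = \frac{\delta_0}{1 - \phi_1 L} = \delta_0 \sum_{j=0}^{\infty} \phi_1^{\,j} L^j, \qquad \text{so} \qquad \beta_j = \delta_0\,\phi_1^{\,j}, \quad j = 0, 1, 2, \ldots$$

If, in addition, the lag polynomial on $Y_t$ in the ADL model equals $\theta_2(L)$, the ADL model has p = 1 and q = 0 and, writing its intercept as $\alpha_0$ and its error as $\tilde{u}_t$, takes the form

$$Y_t = \alpha_0 + \phi_1 Y_{t-1} + \delta_0 X_t + \tilde{u}_t .$$

Two slope coefficients thus imply a geometrically declining multiplier at every lag, whereas estimating the same multipliers directly in a distributed lag regression would require one coefficient per lag.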
As developed here, the assumption that $\phi(L)$ and $\theta_2(L)$ happen to be the same seems
like a coincidence that would not occur in an application. However, the ADL model is able
to capture a large number of shapes of dynamic multipliers with only a few coefficients.
ADL or GLS: Bias versus variance. A good way to think about whether to estimate
dynamic multipliers by first estimating an ADL model and then computing the dynamic
multipliers from the ADL coefficients or, alternatively, by estimating the distributed lag
model directly using GLS is to view the decision in terms of a trade-off between bias
and variance. Estimating the dynamic multipliers using an approximate ADL model

