As a theoretical matter, the families of AR, MA, and ARMA models are equally rich,
as long as the lag polynomials have a sufficiently high degree. Still, in some cases the
autocovariances can be better approximated using an ARMA(p, q) model with small p and q
than by a pure AR model with only a few lags. As a practical matter, however, the estimation
of ARMA models is more difficult than the estimation of AR models, and ARMA
models are more difficult to extend to additional regressors than are AR models.
Appendix
14.5 Consistency of the BIC Lag Length Estimator
This appendix summarizes the argument that the BIC estimator of the lag length, $\hat{p}$, in an
autoregression is correct in large samples; that is, $\Pr(\hat{p} = p) \to 1$. This is not true for the
AIC estimator, which can overestimate $p$ even in large samples.
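
The contrast between the two criteria can be illustrated with a small Monte Carlo experiment. The sketch below is an illustration, not part of the formal argument. It assumes a true AR(1) with coefficient 0.5 and standard normal errors, a maximum lag of 4, and the textbook forms BIC(p) = ln[SSR(p)/T] + (p + 1)(ln T)/T and AIC(p) = ln[SSR(p)/T] + (p + 1)(2/T). The function names (`simulate_ar1`, `ssr_ar`, `select_lag`, `selection_frequencies`) are hypothetical helpers introduced here.

```python
import numpy as np


def simulate_ar1(n_obs, beta1, rng, burn=100):
    """Simulate n_obs observations from Y_t = beta1 * Y_{t-1} + u_t, u_t ~ N(0, 1)."""
    u = rng.standard_normal(n_obs + burn)
    y = np.zeros(n_obs + burn)
    for t in range(1, n_obs + burn):
        y[t] = beta1 * y[t - 1] + u[t]
    return y[burn:]  # drop the burn-in so the start-up value does not matter


def ssr_ar(y, p, pmax):
    """OLS sum of squared residuals of an AR(p) with intercept.

    All lag lengths use the same estimation sample (observations pmax onward),
    so the SSRs are comparable across p.
    """
    dep = y[pmax:]
    cols = [np.ones(len(dep))]
    cols += [y[pmax - j: len(y) - j] for j in range(1, p + 1)]
    X = np.column_stack(cols)
    coef, *_ = np.linalg.lstsq(X, dep, rcond=None)
    resid = dep - X @ coef
    return float(resid @ resid)


def select_lag(y, pmax, criterion="bic"):
    """Return the lag length in 0..pmax that minimizes the BIC or the AIC."""
    T = len(y) - pmax
    penalty = np.log(T) if criterion == "bic" else 2.0
    values = [
        np.log(ssr_ar(y, p, pmax) / T) + (p + 1) * penalty / T
        for p in range(pmax + 1)
    ]
    return int(np.argmin(values))


def selection_frequencies(T, reps=200, beta1=0.5, pmax=4, seed=0):
    """Fraction of replications in which each lag length is selected."""
    rng = np.random.default_rng(seed)
    counts = {"bic": np.zeros(pmax + 1), "aic": np.zeros(pmax + 1)}
    for _ in range(reps):
        y = simulate_ar1(T + pmax, beta1, rng)
        for crit in counts:
            counts[crit][select_lag(y, pmax, crit)] += 1
    return {crit: c / reps for crit, c in counts.items()}


if __name__ == "__main__":
    for T in (100, 1000):
        freq = selection_frequencies(T)
        print(T, "BIC:", freq["bic"], "AIC:", freq["aic"])
```

Under these assumptions, the share of replications in which the BIC selects one lag should approach 1 as T grows, whereas the AIC should keep selecting two or more lags in a nonnegligible share of replications.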
BIC
First consider the special case that the BIC is used to choose among autoregressions with
zero, one, or two lags, when the true lag length is one. It is shown below that (i)
$\Pr(\hat{p} = 0) \to 0$ and (ii) $\Pr(\hat{p} = 2) \to 0$, from which it follows that $\Pr(\hat{p} = 1) \to 1$. The
extension of this argument to the general case of searching over $0 \le p \le p_{max}$ entails
showing that $\Pr(\hat{p} < p) \to 0$ and $\Pr(\hat{p} > p) \to 0$; the strategy for showing these is the same
as used in (i) and (ii) below.
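
For reference, the expansions in the proofs that follow are consistent with the criterion written in the form below, where SSR(p) denotes the sum of squared residuals of the estimated AR(p) and T the number of observations. This is inferred from those expansions rather than restated by the appendix itself.

```latex
\mathrm{BIC}(p) = \ln\!\left[\frac{SSR(p)}{T}\right] + (p + 1)\,\frac{\ln T}{T},
\qquad p = 0, 1, \dots, p_{max}.
```

With p = 0, 1, and 2 the penalty terms are (ln T)/T, 2(ln T)/T, and 3(ln T)/T, which are the terms that appear in the proofs of (i) and (ii).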
Proof of (i) and (ii)
Proof of (i). To choose $\hat{p} = 0$ it must be the case that $\mathrm{BIC}(0) < \mathrm{BIC}(1)$; that
is, $\mathrm{BIC}(0) - \mathrm{BIC}(1) < 0$. Now

$$\mathrm{BIC}(0) - \mathrm{BIC}(1) = [\ln(SSR(0)/T) + (\ln T)/T] - [\ln(SSR(1)/T) + 2(\ln T)/T]$$
$$= \ln(SSR(0)/T) - \ln(SSR(1)/T) - (\ln T)/T.$$

Now $SSR(0)/T = [(T - 1)/T]\, s_Y^2 \xrightarrow{p} \sigma_Y^2$, $SSR(1)/T \xrightarrow{p} \sigma_u^2$, and $(\ln T)/T \to 0$; putting
these pieces together, $\mathrm{BIC}(0) - \mathrm{BIC}(1) \xrightarrow{p} \ln \sigma_Y^2 - \ln \sigma_u^2 > 0$ because $\sigma_Y^2 > \sigma_u^2$.
It follows that $\Pr[\mathrm{BIC}(0) < \mathrm{BIC}(1)] \to 0$, so $\Pr(\hat{p} = 0) \to 0$.
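
The limit in the proof of (i) can be checked numerically. The sketch below assumes an AR(1) with coefficient 0.5 and standard normal errors, so that the error variance is 1, the variance of Y is 1/(1 - 0.25) = 4/3, and the probability limit of BIC(0) - BIC(1) is ln(4/3). The function name `bic_gap_0_vs_1` and the simulation settings are hypothetical choices made for illustration.

```python
import numpy as np


def bic_gap_0_vs_1(T, beta1=0.5, seed=0, burn=200):
    """Compute BIC(0) - BIC(1) on one simulated AR(1) sample of T observations."""
    rng = np.random.default_rng(seed)
    u = rng.standard_normal(T + 1 + burn)
    y = np.zeros(T + 1 + burn)
    for t in range(1, len(y)):
        y[t] = beta1 * y[t - 1] + u[t]
    y = y[burn:]               # T + 1 values, so T observations have a lag
    dep, lag = y[1:], y[:-1]

    # AR(0): regress Y_t on a constant only.
    ssr0 = np.sum((dep - dep.mean()) ** 2)

    # AR(1): regress Y_t on a constant and Y_{t-1}.
    X = np.column_stack([np.ones(T), lag])
    coef, *_ = np.linalg.lstsq(X, dep, rcond=None)
    ssr1 = np.sum((dep - X @ coef) ** 2)

    # ln(SSR(0)/T) - ln(SSR(1)/T) - (ln T)/T, as in the proof of (i).
    return float(np.log(ssr0 / T) - np.log(ssr1 / T) - np.log(T) / T)


if __name__ == "__main__":
    limit = -np.log(1 - 0.5 ** 2)  # ln(sigma_Y^2 / sigma_u^2) = ln(4/3)
    for T in (100, 1000, 20000):
        print(T, bic_gap_0_vs_1(T), "limit:", limit)
```

As T grows, the computed gap should settle near the positive limit, which is why the event BIC(0) < BIC(1) becomes increasingly unlikely.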
Proof of (ii). To choose $\hat{p} = 2$, it must be the case that $\mathrm{BIC}(2) < \mathrm{BIC}(1)$ or
$\mathrm{BIC}(2) - \mathrm{BIC}(1) < 0$. Now

$$T[\mathrm{BIC}(2) - \mathrm{BIC}(1)] = T\{[\ln(SSR(2)/T) + 3(\ln T)/T] - [\ln(SSR(1)/T) + 2(\ln T)/T]\}$$
$$= T \ln[SSR(2)/SSR(1)] + \ln T = -T \ln[1 + F/(T - 2)] + \ln T,$$

where $F = [SSR(1) - SSR(2)]/[SSR(2)/(T - 2)]$ is the homoskedasticity-only
F-statistic [Equation (7.13)] testing the null hypothesis that $\beta_2 = 0$ in the AR(2). If $u_t$ is

