750 Chapter 17 The Theory of Linear Regression with One Regressor
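$$
\begin{aligned}
E(W^2) &= \int_{-\infty}^{\infty} w^2 f(w)\,dw \\
&= \int_{-\infty}^{-\delta} w^2 f(w)\,dw + \int_{-\delta}^{\delta} w^2 f(w)\,dw + \int_{\delta}^{\infty} w^2 f(w)\,dw \\
&\ge \int_{-\infty}^{-\delta} w^2 f(w)\,dw + \int_{\delta}^{\infty} w^2 f(w)\,dw \\
&\ge \delta^2 \left[ \int_{-\infty}^{-\delta} f(w)\,dw + \int_{\delta}^{\infty} f(w)\,dw \right] \\
&= \delta^2 \Pr(|W| \ge \delta),
\end{aligned}
\tag{17.43}
$$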
where the first equality is the definition of $E(W^2)$, the second equality holds because the
ranges of integration divide up the real line, the first inequality holds because the term
that was dropped is nonnegative, the second inequality holds because $w^2 \ge \delta^2$ over the
range of integration, and the final equality holds by the definition of $\Pr(|W| \ge \delta)$.
Substituting $W = V - \mu_V$ into the final expression, noting that
$E(W^2) = E[(V - \mu_V)^2] = \mathrm{var}(V)$, and rearranging yields the inequality given in
Equation (17.42). If $V$ is discrete, this proof applies with summations replacing integrals.
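The bound can also be checked numerically. The following is a minimal sketch, not part of the original derivation: it treats a list of numbers as a discrete distribution with equal mass on each entry, so the summation version of the proof applies and the tail probability should never exceed the variance divided by the squared threshold. The function name `chebychev_check`, the exponential draws, and the seed are illustrative assumptions.

```python
import random


def chebychev_check(values, delta):
    """Return (Pr(|V - mu_V| >= delta), var(V) / delta**2) for the discrete
    distribution that puts equal mass on each entry of `values`."""
    n = len(values)
    mu = sum(values) / n
    # Population variance (divide by n), matching var(V) = E[(V - mu_V)^2].
    var = sum((v - mu) ** 2 for v in values) / n
    # Share of entries at least `delta` away from the mean.
    tail = sum(1 for v in values if abs(v - mu) >= delta) / n
    return tail, var / delta ** 2


random.seed(0)  # arbitrary seed, used only for reproducibility
draws = [random.expovariate(1.0) for _ in range(10_000)]  # hypothetical data
for delta in (0.5, 1.0, 2.0, 3.0):
    tail, bound = chebychev_check(draws, delta)
    print(f"delta={delta}: tail probability={tail:.4f}, bound={bound:.4f}")
```

For small thresholds the bound exceeds 1 and is uninformative; it becomes useful only once the threshold is larger than the standard deviation of V.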
The Cauchy–Schwarz Inequality
The Cauchy–Schwarz inequality is an extension of the correlation inequality, $|\rho_{XY}| \le 1$,
to incorporate nonzero means. The Cauchy–Schwarz inequality is
$$
|E(XY)| \le \sqrt{E(X^2)E(Y^2)} \quad \text{(Cauchy–Schwarz inequality)}. \tag{17.44}
$$
The proof of Equation (17.44) is similar to the proof of the correlation inequality in
Appendix 2.1. Let $W = Y + bX$, where $b$ is a constant. Then
$E(W^2) = E(Y^2) + 2bE(XY) + b^2E(X^2)$. Now let $b = -E(XY)/E(X^2)$ so that (after
simplification) the expression becomes $E(W^2) = E(Y^2) - [E(XY)]^2/E(X^2)$. Because
$E(W^2) \ge 0$ (since $W^2 \ge 0$), it must be the case that $[E(XY)]^2 \le E(X^2)E(Y^2)$, and
the Cauchy–Schwarz inequality follows by taking the square root.
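The steps of this proof can be traced numerically. The following is a minimal sketch under stated assumptions: expectations are replaced by averages over paired lists (a discrete distribution with equal mass on each pair), and E(X^2) is assumed to be strictly positive so that b is well defined. The function names `second_moments` and `cauchy_schwarz_terms` are illustrative and do not come from the text.

```python
import math
import random


def second_moments(xs, ys):
    """Return (E(X^2), E(Y^2), E(XY)) as averages over the paired lists."""
    n = len(xs)
    exx = sum(x * x for x in xs) / n
    eyy = sum(y * y for y in ys) / n
    exy = sum(x * y for x, y in zip(xs, ys)) / n
    return exx, eyy, exy


def cauchy_schwarz_terms(xs, ys):
    """Return E(W^2) computed two ways for W = Y + bX, b = -E(XY)/E(X^2):
    directly from W, and from the simplified expression in the proof."""
    exx, eyy, exy = second_moments(xs, ys)
    b = -exy / exx  # assumes E(X^2) > 0
    n = len(xs)
    direct = sum((y + b * x) ** 2 for x, y in zip(xs, ys)) / n
    simplified = eyy - exy ** 2 / exx
    return direct, simplified


random.seed(0)  # arbitrary seed, used only for reproducibility
xs = [random.gauss(1.0, 2.0) for _ in range(5_000)]    # hypothetical data
ys = [0.5 * x + random.gauss(3.0, 1.0) for x in xs]    # nonzero means
exx, eyy, exy = second_moments(xs, ys)
direct, simplified = cauchy_schwarz_terms(xs, ys)
print(f"E(W^2) directly: {direct:.6f}, simplified form: {simplified:.6f}")
print(f"|E(XY)| = {abs(exy):.6f}, sqrt(E(X^2)E(Y^2)) = {math.sqrt(exx * eyy):.6f}")
```

The two values of E(W^2) should agree up to rounding and be nonnegative, which is exactly the fact the proof uses. When Y is an exact multiple of X, E(W^2) is zero and the inequality holds with equality.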

