
208	 Chapter 5  Regression with a Single Regressor: Hypothesis Tests and Confidence Intervals

The Economic Value of a Year of Education:
Homoskedasticity or Heteroskedasticity?

On average, workers with more education have higher earnings than workers with less education. But if the best-paying jobs mainly go to the college educated, it might also be that the spread of the distribution of earnings is greater for workers with more education. Does the distribution of earnings spread out as education increases?

This is an empirical question, so answering it requires analyzing data. Figure 5.3 is a scatterplot of the hourly earnings and the number of years of education for a sample of 2,829 full-time workers in the United States in 2012, ages 29 and 30, with between 6 and 18 years of education. The data come from the March 2013 Current Population Survey, which is described in Appendix 3.1.

Figure 5.3 has two striking features. The first is that the mean of the distribution of earnings increases with the number of years of education. This increase is summarized by the OLS regression line,

\[
\widehat{\text{Earnings}} = \underset{(1.10)}{-7.29} + \underset{(0.08)}{1.93}\,\text{Years Education}, \quad R^2 = 0.162,\ \text{SER} = 10.29. \tag{5.23}
\]

This line is plotted in Figure 5.3. The coefficient of 1.93 in the OLS regression line means that, on average, hourly earnings increase by $1.93 for each additional year of education. The 95% confidence interval for this coefficient is 1.93 ± 1.96 × 0.08, or 1.77 to 2.09.

The second striking feature of Figure 5.3 is that the spread of the distribution of earnings increases with the years of education. While some workers with many years of education have low-paying jobs, very few workers with low levels of education have high-paying jobs. This can be quantified by looking at the spread of the residuals around the OLS regression line. For workers with ten years of education, the standard deviation of the residuals is $4.32; for workers with a high school diploma, this standard deviation is $7.80; and for workers with a college degree, this standard deviation increases to $12.46. Because these standard deviations differ for different levels of education, the variance of the residuals in the regression of Equation (5.23) depends on the value of the regressor (the years of education); in other words, the regression errors are heteroskedastic. In real-world terms, not all college graduates will be earning $50 per hour by the time they are 29, but some will, and workers with only ten years of education have no shot at those jobs.
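
The 95% confidence interval quoted in the text for the coefficient on years of education can be checked with a few lines of arithmetic. The following is a minimal sketch: the estimate (1.93) and standard error (0.08) are taken from Equation (5.23), while the function name `confidence_interval` is a hypothetical helper introduced only for illustration.

```python
# Values reported in Equation (5.23).
slope = 1.93      # estimated coefficient on years of education
se_slope = 0.08   # standard error of that coefficient


def confidence_interval(estimate, std_error, critical_value=1.96):
    """Two-sided interval: estimate +/- critical_value * std_error.

    The default 1.96 is the large-sample normal critical value for a
    95% confidence level. (Hypothetical helper, for illustration only.)
    """
    margin = critical_value * std_error
    return estimate - margin, estimate + margin


lower, upper = confidence_interval(slope, se_slope)
print(f"95% CI: {lower:.2f} to {upper:.2f}")  # prints "95% CI: 1.77 to 2.09"
```

The margin is 1.96 × 0.08 = 0.1568, so the endpoints are 1.7732 and 2.0868, which round to the 1.77 and 2.09 reported in the text.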

Figure 5.3 	Scatterplot of Hourly Earnings and Years of Education for 29- to 30-Year-Olds in the United States in 2012

Hourly earnings are plotted against years of education for 2,829 full-time 29- to 30-year-old workers. The spread around the regression line increases with the years of education, indicating that the regression errors are heteroskedastic.

[Figure: scatterplot with fitted regression line. Vertical axis: average hourly earnings (ahe), 0 to 150. Horizontal axis: years of education, 5 to 20. Legend: ahe, Fitted values.]
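
The diagnostic described in the text, comparing the standard deviation of the OLS residuals across education levels, can be sketched as follows. This is an illustration on simulated data, not the Current Population Survey sample: the error standard deviation `0.8 * years - 3.0`, the sample size, and the helper names `ols_fit` and `residual_sd_by_group` are all assumptions made for the example. Treating 12 years as a high school diploma and 16 years as a college degree is likewise an assumption.

```python
import random
import statistics


def ols_fit(x, y):
    """Return (intercept, slope) of the least-squares line of y on x."""
    x_bar = sum(x) / len(x)
    y_bar = sum(y) / len(y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope


def residual_sd_by_group(x, y, groups):
    """Sample standard deviation of OLS residuals at each value in `groups`."""
    intercept, slope = ols_fit(x, y)
    result = {}
    for g in groups:
        resid = [yi - (intercept + slope * xi)
                 for xi, yi in zip(x, y) if xi == g]
        result[g] = statistics.stdev(resid)
    return result


# Simulated data (hypothetical): the mean follows the line in Equation (5.23),
# and the error spread grows with education by an assumed rule.
rng = random.Random(0)
years = [rng.randint(6, 18) for _ in range(5000)]
earnings = [-7.29 + 1.93 * yr + rng.gauss(0, 0.8 * yr - 3.0) for yr in years]

# Assumed mapping: 10 years, 12 years (high school), 16 years (college).
sds = residual_sd_by_group(years, earnings, [10, 12, 16])
for level, sd in sds.items():
    print(f"{level} years of education: residual SD = {sd:.2f}")
```

If the errors were homoskedastic, the three standard deviations would be roughly equal; a clear upward pattern across education levels, as in the figures reported in the text, points to heteroskedasticity.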