Chapter 3: Review of Statistics
Statistics is the science of using data to learn about the world around us.
Statistical tools help us answer questions about unknown characteristics of
distributions in populations of interest. For example, what is the mean of the distribution of
earnings of recent college graduates? Do mean earnings differ for men and women,
and, if so, by how much?

These questions relate to the distribution of earnings in the population of
workers. One way to answer these questions would be to perform an exhaustive
survey of the population of workers, measuring the earnings of each worker and
thus finding the population distribution of earnings. In practice, however, such a
comprehensive survey would be extremely expensive. The only comprehensive survey
of the U.S. population is the decennial census, which cost $13 billion to carry
out in 2010. The process of designing the census forms, managing and conducting
the surveys, and compiling and analyzing the data takes ten years. Despite this
extraordinary commitment, many members of the population slip through the
cracks and are not surveyed. Thus a different, more practical approach is needed.

The key insight of statistics is that one can learn about a population distribution
by selecting a random sample from that population. Rather than survey the entire
U.S. population, we might survey, say, 1000 members of the population, selected at
random by simple random sampling. With statistical methods, we can use this
sample to reach tentative conclusions (that is, to draw statistical inferences)
about characteristics of the full population.
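To make the idea of simple random sampling concrete, the following Python sketch draws 1000 members at random from a simulated population and compares the sample mean with the population mean. The population here is hypothetical: the earnings figures are generated from a lognormal distribution chosen only for illustration, and the helper name `simple_random_sample` is an assumption, not something defined in this chapter.

```python
import random
import statistics


def simple_random_sample(population, n, seed=0):
    """Draw n distinct members so that every subset of size n is equally likely."""
    rng = random.Random(seed)
    return rng.sample(population, n)


# Hypothetical population: 100,000 simulated annual earnings (not real data).
population_rng = random.Random(42)
population = [population_rng.lognormvariate(10.5, 0.6) for _ in range(100_000)]

# Survey only 1000 members instead of the whole population.
sample = simple_random_sample(population, 1000, seed=1)

population_mean = statistics.fmean(population)
sample_mean = statistics.fmean(sample)

print(f"population mean:        {population_mean:,.0f}")
print(f"sample mean (n = 1000): {sample_mean:,.0f}")
```

In a real application the population mean is unknown, which is exactly why the sample is drawn. It is computed here only because the population is simulated, so the two numbers can be compared.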
Three types of statistical methods are used throughout econometrics: estimation,
hypothesis testing, and confidence intervals. Estimation entails computing a
“best guess” numerical value for an unknown characteristic of a population
distribution, such as its mean, from a sample of data. Hypothesis testing entails
formulating a specific hypothesis about the population, then using sample evidence to
decide whether it is true. Confidence intervals use a set of data to estimate an
interval or range for an unknown population characteristic. Sections 3.1, 3.2, and 3.3
review estimation, hypothesis testing, and confidence intervals in the context of
statistical inference about an unknown population mean.
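As a rough preview of how the three methods fit together for a population mean, the Python sketch below computes an estimate, a two-sided test of a hypothesized mean, and a confidence interval from one sample. It assumes a large sample, so that the standard normal distribution is a reasonable approximation. The function name `mean_inference`, the hypothesized value `mu_0`, and the simulated data are all illustrative assumptions rather than part of the text.

```python
import math
import random
from statistics import NormalDist, fmean, stdev


def mean_inference(sample, mu_0, level=0.95):
    """Estimate, test, and interval for a population mean (large-sample sketch)."""
    n = len(sample)
    y_bar = fmean(sample)                      # estimation: the sample average
    std_error = stdev(sample) / math.sqrt(n)   # standard error of the sample average
    t_stat = (y_bar - mu_0) / std_error        # hypothesis testing: H0 says mean = mu_0
    std_normal = NormalDist()
    p_value = 2 * (1 - std_normal.cdf(abs(t_stat)))    # two-sided p-value
    z_crit = std_normal.inv_cdf(1 - (1 - level) / 2)   # about 1.96 when level = 0.95
    conf_int = (y_bar - z_crit * std_error, y_bar + z_crit * std_error)
    return {
        "estimate": y_bar,
        "std_error": std_error,
        "t_stat": t_stat,
        "p_value": p_value,
        "conf_int": conf_int,
    }


# Hypothetical data: 1000 simulated hourly earnings (not real data).
rng = random.Random(0)
sample = [rng.gauss(20.0, 5.0) for _ in range(1000)]

result = mean_inference(sample, mu_0=19.0)
for name, value in result.items():
    print(name, value)
```

Under this approximation, the test and the interval agree: the hypothesized value `mu_0` is rejected at the 5% level when it falls outside the 95% confidence interval.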
Most of the interesting questions in economics involve relationships between
two or more variables or comparisons between different populations. For example,

