
Chapter 3

Review of Statistics

Statistics is the science of using data to learn about the world around us. Statistical tools help us answer questions about unknown characteristics of distributions in populations of interest. For example, what is the mean of the distribution of earnings of recent college graduates? Do mean earnings differ for men and women, and, if so, by how much?

These questions relate to the distribution of earnings in the population of workers. One way to answer these questions would be to perform an exhaustive survey of the population of workers, measuring the earnings of each worker and thus finding the population distribution of earnings. In practice, however, such a comprehensive survey would be extremely expensive. The only comprehensive survey of the U.S. population is the decennial census, which cost $13 billion to carry out in 2010. The process of designing the census forms, managing and conducting the surveys, and compiling and analyzing the data takes ten years. Despite this extraordinary commitment, many members of the population slip through the cracks and are not surveyed. Thus a different, more practical approach is needed.

The key insight of statistics is that one can learn about a population distribution by selecting a random sample from that population. Rather than survey the entire U.S. population, we might survey, say, 1000 members of the population, selected at random by simple random sampling. Using statistical methods, we can use this sample to reach tentative conclusions (to draw statistical inferences) about characteristics of the full population.
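
To make the idea of simple random sampling concrete, the following Python sketch draws 1000 observations at random from a simulated population and compares the sample mean with the population mean. The population here is entirely hypothetical: the log-normal shape, its parameters, the population size of 200,000, and the function names are assumptions chosen for illustration, not actual earnings data.

```python
import random
import statistics


def simulate_population(size, seed=0):
    """Build a hypothetical population of hourly earnings.

    The log-normal shape and its parameters are illustrative
    assumptions, not estimates from real data.
    """
    rng = random.Random(seed)
    return [rng.lognormvariate(3.0, 0.5) for _ in range(size)]


def simple_random_sample(population, n, seed=1):
    """Draw n members without replacement, each equally likely to be chosen."""
    rng = random.Random(seed)
    return rng.sample(population, n)


population = simulate_population(200_000)
sample = simple_random_sample(population, 1000)

# In practice the population mean is unknown; it is computable here
# only because the whole population was simulated.
population_mean = statistics.fmean(population)
sample_mean = statistics.fmean(sample)

print(f"Population mean: {population_mean:.2f}")
print(f"Sample mean (n = 1000): {sample_mean:.2f}")
```

In this simulation the sample mean should land close to the population mean even though only a small fraction of the population is surveyed, which is the practical payoff of random sampling.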

Three types of statistical methods are used throughout econometrics: estimation, hypothesis testing, and confidence intervals. Estimation entails computing a “best guess” numerical value for an unknown characteristic of a population distribution, such as its mean, from a sample of data. Hypothesis testing entails formulating a specific hypothesis about the population, then using sample evidence to decide whether it is true. Confidence intervals use a set of data to estimate an interval or range for an unknown population characteristic. Sections 3.1, 3.2, and 3.3 review estimation, hypothesis testing, and confidence intervals in the context of statistical inference about an unknown population mean.
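
As a preview of how the three methods fit together for a population mean, the sketch below computes an estimate (the sample average), a two-sided test of a hypothesized mean, and a 95% confidence interval. It relies on the large-sample normal approximation. The simulated data, the hypothesized value of 20, and the function name `mean_inference` are assumptions made for illustration only.

```python
import math
import random
import statistics
from statistics import NormalDist


def mean_inference(sample, mu_null, confidence=0.95):
    """Estimate, test, and interval for a population mean.

    Uses the large-sample normal approximation, so it is a sketch
    suited to reasonably large samples.
    """
    n = len(sample)
    y_bar = statistics.fmean(sample)                 # estimation
    se = statistics.stdev(sample) / math.sqrt(n)     # standard error of the mean
    t_stat = (y_bar - mu_null) / se                  # hypothesis testing
    p_value = 2 * (1 - NormalDist().cdf(abs(t_stat)))
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    interval = (y_bar - z * se, y_bar + z * se)      # confidence interval
    return {
        "estimate": y_bar,
        "standard_error": se,
        "t_statistic": t_stat,
        "p_value": p_value,
        "confidence_interval": interval,
    }


# Hypothetical sample: 1000 draws from a normal distribution
# with mean 20 and standard deviation 5 (illustrative values).
rng = random.Random(42)
sample = [rng.gauss(20.0, 5.0) for _ in range(1000)]

result = mean_inference(sample, mu_null=20.0)
for name, value in result.items():
    print(name, value)
```

A small p-value would count as evidence against the hypothesized mean, while the confidence interval reports the range of values for the mean that the sample does not reject at the chosen level.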

Most of the interesting questions in economics involve relationships between two or more variables or comparisons between different populations. For example,
