
160	 Chapter 4  Linear Regression with One Regressor

                         population regression line, so the error term for that district, $u_1$, is positive. In
                         contrast, $Y_2$ is below the population regression line, so test scores for that district
                         were worse than predicted, and $u_2 < 0$.

                              Now return to your problem as advisor to the superintendent: What is the
                         expected effect on test scores of reducing the student–teacher ratio by two students
                         per teacher? The answer is easy: The expected change is $(-2) \times \beta_{ClassSize}$.
                         But what is the value of $\beta_{ClassSize}$?
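
                         The arithmetic can be made concrete with a minimal sketch. The slope value used
                         below is purely hypothetical, chosen only for illustration; the true population
                         slope is unknown, which is exactly the problem the next section takes up. The
                         function name `expected_change` is likewise an assumption of this sketch.

```python
def expected_change(slope_class_size: float, delta_ratio: float) -> float:
    """Expected change in test scores when the student-teacher ratio
    changes by `delta_ratio` students per teacher."""
    return delta_ratio * slope_class_size


# Hypothetical slope on class size, for illustration only (not an estimate).
hypothetical_slope = -0.6

# Reducing the ratio by two students per teacher means delta_ratio = -2.
change = expected_change(hypothetical_slope, -2.0)
print(f"Expected change in test scores: {change:.2f} points")
```

                         With a negative slope, a reduction in the ratio gives a positive expected change
                         in test scores; the size of that change depends entirely on the unknown slope.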

	 4.2	 Estimating the Coefficients of the Linear Regression Model

                         In a practical situation such as the application to class size and test scores, the
                         intercept $\beta_0$ and slope $\beta_1$ of the population regression line are unknown.
                         Therefore, we must use data to estimate the unknown slope and intercept of the
                         population regression line.

                              This estimation problem is similar to others you have faced in statistics. For
                         example, suppose you want to compare the mean earnings of men and women
                         who recently graduated from college. Although the population mean earnings are
                         unknown, we can estimate the population means using a random sample of male
                         and female college graduates. Then the natural estimator of the unknown population
                         mean earnings for women, for example, is the average earnings of the female
                         college graduates in the sample.
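
                         The idea of estimating a population mean by a sample average can be sketched
                         in a few lines. The earnings figures below are simulated, made-up numbers and
                         not real data; the population size, mean, and spread are all assumptions of
                         this illustration.

```python
import random
import statistics

random.seed(0)  # fixed seed so the illustration is reproducible

# Simulated "population" of annual earnings (made-up numbers, not real data).
population = [random.gauss(50_000, 10_000) for _ in range(100_000)]

# Draw a simple random sample without replacement.
sample = random.sample(population, 1_000)

population_mean = statistics.mean(population)  # unknown in practice
sample_mean = statistics.mean(sample)          # the natural estimator

print(f"population mean: {population_mean:,.0f}")
print(f"sample mean:     {sample_mean:,.0f}")
```

                         In practice only the sample is observed; the population mean is computed here
                         solely so that the two numbers can be compared.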

                              The same idea extends to the linear regression model. We do not know the
                         population value of $\beta_{ClassSize}$, the slope of the unknown population regression line
                         relating X (class size) and Y (test scores). But just as it was possible to learn about
                         the population mean using a sample of data drawn from that population, so is it
                         possible to learn about the population slope $\beta_{ClassSize}$ using a sample of data.
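
                         A rough sketch of what learning about a slope from a sample can look like
                         follows. The passage above has not yet named an estimator, so the least squares
                         slope used here is an assumption of this sketch, as are the simulated data, the
                         "true" intercept and slope, and the function names.

```python
import random

random.seed(1)  # fixed seed so the illustration is reproducible

# Assumed "population" line, known here only because the data are simulated.
TRUE_INTERCEPT = 700.0
TRUE_SLOPE = -2.0


def draw_sample(n):
    """Simulate n districts: X is class size, Y is the test score."""
    xs = [random.uniform(14, 26) for _ in range(n)]
    ys = [TRUE_INTERCEPT + TRUE_SLOPE * x + random.gauss(0, 10) for x in xs]
    return xs, ys


def least_squares_slope(xs, ys):
    """Slope of the least squares line: sample covariance of X and Y
    divided by the sample variance of X."""
    x_bar = sum(xs) / len(xs)
    y_bar = sum(ys) / len(ys)
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    return s_xy / s_xx


xs, ys = draw_sample(420)
slope_hat = least_squares_slope(xs, ys)
print(f"true slope: {TRUE_SLOPE:.2f}, estimated slope: {slope_hat:.2f}")
```

                         The estimate differs from the assumed true slope because of sampling
                         variation, just as a sample average differs from the population mean.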

                              The data we analyze here consist of test scores and class sizes in 1999 in 420
                         California school districts that serve kindergarten through eighth grade. The test
                         score is the districtwide average of reading and math scores for fifth graders. Class
                         size can be measured in various ways. The measure used here is one of the broadest,
                         which is the number of students in the district divided by the number of teachers—
                         that is, the districtwide student–teacher ratio. These data are described in more
                         detail in Appendix 4.1.
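
                         The class size measure described above is a simple ratio, sketched below. The
                         district names and enrollment counts are hypothetical and are not taken from
                         the California data set; the function name is also an assumption.

```python
def student_teacher_ratio(students: int, teachers: int) -> float:
    """Districtwide student-teacher ratio: students divided by teachers."""
    if teachers <= 0:
        raise ValueError("a district must have at least one teacher")
    return students / teachers


# Hypothetical districts: (number of students, number of teachers).
districts = {
    "District A": (1950, 100),
    "District B": (4200, 200),
}

ratios = {name: student_teacher_ratio(s, t) for name, (s, t) in districts.items()}
for name, ratio in ratios.items():
    print(f"{name}: {ratio:.1f} students per teacher")
```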

                              Table 4.1 summarizes the distributions of test scores and class sizes for this
                         sample. The average student–teacher ratio is 19.6 students per teacher, and the standard
                         deviation is 1.9 students per teacher. The 10th percentile of the distribution of the