4.1 The Linear Regression Model
When you propose Equation (4.3) to the superintendent, she tells you that
something is wrong with this formulation. She points out that class size is just one
of many facets of elementary education and that two districts with the same class
sizes will have different test scores for many reasons. One district might have
better teachers or it might use better textbooks. Two districts with comparable class
sizes, teachers, and textbooks still might have very different student populations;
perhaps one district has more immigrants (and thus fewer native English
speakers) or wealthier families. Finally, she points out that even if two districts are the
same in all these ways, they might have different test scores for essentially random
reasons having to do with the performance of the individual students on the day
of the test. She is right, of course; for all these reasons, Equation (4.3) will not hold
exactly for all districts. Instead, it should be viewed as a statement about a
relationship that holds on average across the population of districts.
A version of this linear relationship that holds for each district must incorporate
these other factors influencing test scores, including each district’s unique
characteristics (for example, the quality of its teachers, the background of its
students, and how lucky the students were on test day). One approach would be to list
the most important factors and to introduce them explicitly into Equation (4.3)
(an idea we return to in Chapter 6). For now, however, we simply lump all these
“other factors” together and write the relationship for a given district as
$$\mathit{TestScore} = \beta_0 + \beta_{ClassSize} \times \mathit{ClassSize} + \text{other factors}. \tag{4.4}$$
Thus the test score for the district is written in terms of one component,
$\beta_0 + \beta_{ClassSize} \times \mathit{ClassSize}$, that represents the average effect of class size on scores
in the population of school districts and a second component that represents all
other factors.
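As a minimal sketch of the two components in Equation (4.4), the Python snippet below computes test scores for two districts with the same class size. The coefficient values, the class size of 20, the "other factors" values, and the function names are all hypothetical, chosen only for illustration; none of them come from the text or from data.

```python
# Hypothetical coefficients for illustration only (not estimates from the text).
BETA_0 = 700.0           # assumed intercept
BETA_CLASS_SIZE = -2.0   # assumed slope on class size

def systematic_component(class_size):
    """First component of Equation (4.4): the part of the score tied to class size."""
    return BETA_0 + BETA_CLASS_SIZE * class_size

def district_score(class_size, other_factors):
    """Equation (4.4): class-size component plus all other factors."""
    return systematic_component(class_size) + other_factors

# Two hypothetical districts with identical class sizes but different
# "other factors" (teachers, textbooks, student background, test-day luck).
score_a = district_score(class_size=20, other_factors=8.0)
score_b = district_score(class_size=20, other_factors=-5.0)

print(systematic_component(20))  # same for both districts
print(score_a, score_b)          # differ only because of other factors
```

Under these assumed numbers, the two districts share the same class-size component, and the entire gap between their scores comes from the "other factors" term.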
Although this discussion has focused on test scores and class size, the idea
expressed in Equation (4.4) is much more general, so it is useful to introduce more
general notation. Suppose you have a sample of $n$ districts. Let $Y_i$ be the average
test score in the $i$th district, let $X_i$ be the average class size in the $i$th district, and let
$u_i$ denote the other factors influencing the test score in the $i$th district. Then
Equation (4.4) can be written more generally as
$$Y_i = \beta_0 + \beta_1 X_i + u_i, \tag{4.5}$$
for each district (that is, $i = 1, \ldots, n$), where $\beta_0$ is the intercept of this line and $\beta_1$
is the slope. [The general notation $\beta_1$ is used for the slope in Equation (4.5) instead
of $\beta_{ClassSize}$ because this equation is written in terms of a general variable $X_i$.]
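To make the general notation concrete, the following sketch simulates a sample of n districts from Equation (4.5). It is an illustration under stated assumptions, not a procedure from the text: the intercept and slope values, the range of class sizes, and the choice to draw the "other factors" term from a normal distribution with mean zero are all hypothetical, as is the function name `simulate_districts`.

```python
import random

def simulate_districts(n, beta_0, beta_1, seed=0):
    """Draw n hypothetical districts satisfying Y_i = beta_0 + beta_1 * X_i + u_i."""
    rng = random.Random(seed)
    districts = []
    for _ in range(n):
        x_i = rng.uniform(14.0, 26.0)  # assumed range of average class sizes
        u_i = rng.gauss(0.0, 10.0)     # assumed "other factors": mean 0, s.d. 10
        y_i = beta_0 + beta_1 * x_i + u_i
        districts.append((x_i, y_i, u_i))
    return districts

# Hypothetical intercept and slope, for illustration only.
BETA_0, BETA_1 = 700.0, -2.0
sample = simulate_districts(n=1000, beta_0=BETA_0, beta_1=BETA_1)

# With u_i included, the equation holds exactly for every district.
max_gap = max(abs(y - (BETA_0 + BETA_1 * x + u)) for x, y, u in sample)

# Without u_i, the line holds only on average: individual deviations can be
# large, but under the mean-zero assumption they roughly cancel across districts.
mean_u = sum(u for _, _, u in sample) / len(sample)
largest_u = max(abs(u) for _, _, u in sample)

print(max_gap, mean_u, largest_u)
```

In this simulated sample, individual districts can sit well above or below the line, while the average deviation is close to zero. That matches the sense in which the linear relationship is a statement about the population of districts rather than about any single one.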

