
Chapter 8 Nonlinear Regression Functions

look at possible nonlinearities in the relationship between test scores and the
student–teacher ratio, holding student characteristics constant. In some applications,
the regression function is a nonlinear function of the X’s and of the parameters. If so,
the parameters cannot be estimated by OLS, but they can be estimated using
nonlinear least squares. Appendix 8.1 provides examples of such functions and
describes the nonlinear least squares estimator.
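To make the distinction concrete, the following pair of specifications may help. Both are illustrative assumptions chosen for this sketch and are not necessarily the functions given in Appendix 8.1. The first is nonlinear in X but linear in the parameters, so OLS applies; the second is nonlinear in the parameter β1, so OLS does not apply. The last line states the nonlinear least squares problem for a generic regression function f, a symbol introduced here for illustration.

```latex
% Nonlinear in X, linear in the parameters: can be estimated by OLS
Y_i = \beta_0 + \beta_1 X_i + \beta_2 X_i^2 + u_i

% Nonlinear in the parameter \beta_1: cannot be estimated by OLS
Y_i = \beta_0 e^{\beta_1 X_i} + u_i

% Nonlinear least squares: choose b_0, ..., b_m to minimize the sum of squared residuals
\min_{b_0, \ldots, b_m} \sum_{i=1}^{n} \left[ Y_i - f(X_{1i}, \ldots, X_{ki}; b_0, \ldots, b_m) \right]^2
```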

8.1 A General Strategy for Modeling Nonlinear Regression Functions

This section lays out a general strategy for modeling nonlinear population
regression functions. In this strategy, the nonlinear models are extensions of the
multiple regression model and therefore can be estimated and tested using the tools of
Chapters 6 and 7. First, however, we return to the California test score data and
consider the relationship between test scores and district income.

Test Scores and District Income

In Chapter 7, we found that the economic background of the students is an
important factor in explaining performance on standardized tests. That analysis used
two economic background variables (the percentage of students qualifying for a
subsidized lunch and the percentage of district families qualifying for income
assistance) to measure the fraction of students in the district coming from poor
families. A different, broader measure of economic background is the average
annual per capita income in the school district (“district income”). The California
data set includes district income measured in thousands of 1998 dollars. The
sample contains a wide range of income levels: For the 420 districts in our sample, the
median district income is 13.7 (that is, $13,700 per person), and it ranges from 5.3
($5,300 per person) to 55.3 ($55,300 per person).

Figure 8.2 shows a scatterplot of fifth-grade test scores against district income
for the California data set, along with the OLS regression line relating these two
variables. Test scores and average income are strongly positively correlated, with
a correlation coefficient of 0.71; students from affluent districts do better on the
tests than students from poor districts. But this scatterplot has a peculiarity: Most
of the points are below the OLS line when income is very low (under $10,000) or
very high (over $40,000), but are above the line when income is between $15,000
and $30,000. There seems to be some curvature in the relationship between test
scores and income that is not captured by the linear regression.
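The pattern described above can be checked numerically by fitting the OLS line and averaging the residuals within income ranges. The sketch below is a minimal illustration under stated assumptions: it uses synthetic data generated from a concave (logarithmic) relationship, not the California data set, and the helper names `ols_line` and `mean_residual`, the coefficients, and the random seed are all hypothetical choices made for this example.

```python
import math
import random


def ols_line(x, y):
    """Return (intercept, slope) of the OLS regression line of y on x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope


def mean_residual(x, y, intercept, slope, lo, hi):
    """Average OLS residual over observations with lo <= x < hi."""
    res = [yi - (intercept + slope * xi)
           for xi, yi in zip(x, y) if lo <= xi < hi]
    return sum(res) / len(res)


# SYNTHETIC data (an assumption for illustration, NOT the California data):
# income in thousands of dollars, scores generated from a concave function.
rng = random.Random(0)
income = [rng.uniform(5.0, 55.0) for _ in range(420)]
score = [560.0 + 30.0 * math.log(inc) + rng.gauss(0.0, 5.0) for inc in income]

b0, b1 = ols_line(income, score)

# Average residual in the three income ranges discussed in the text.
low = mean_residual(income, score, b0, b1, 5.0, 10.0)     # under $10,000
mid = mean_residual(income, score, b0, b1, 15.0, 30.0)    # $15,000 to $30,000
high = mean_residual(income, score, b0, b1, 40.0, 56.0)   # over $40,000

print(f"mean residual, low income:  {low:.2f}")
print(f"mean residual, mid income:  {mid:.2f}")
print(f"mean residual, high income: {high:.2f}")
```

When the true relationship is concave, the average residual from a straight-line fit tends to be negative at the low and high ends of the income range and positive in the middle. That is the same sign pattern as the points lying below, above, and then below the OLS line in the scatterplot.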
