
304	 Chapter 8  Nonlinear Regression Functions

                         look at possible nonlinearities in the relationship between test scores and the
                         student–teacher ratio, holding student characteristics constant. In some applications,
                         the regression function is a nonlinear function of the X’s and of the parameters. If so,
                         the parameters cannot be estimated by OLS, but they can be estimated using
                         nonlinear least squares. Appendix 8.1 provides examples of such functions and
                         describes the nonlinear least squares estimator.
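                         To make the distinction concrete, consider an illustrative pair of specifications.
                         The functional forms here are assumptions chosen for illustration and are not taken
                         from the text above. The first is nonlinear in X but linear in the parameters, so
                         OLS still applies once the squared term is treated as an additional regressor. The
                         second is nonlinear in the parameter \beta_1, so it would call for nonlinear least
                         squares:

\[
Y_i = \beta_0 + \beta_1 X_i + \beta_2 X_i^2 + u_i
\]

\[
Y_i = \beta_0 \left( 1 - e^{-\beta_1 X_i} \right) + u_i
\]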

	 8.1	 A General Strategy for Modeling Nonlinear Regression Functions

                         This section lays out a general strategy for modeling nonlinear population regression
                         functions. In this strategy, the nonlinear models are extensions of the multiple
                         regression model and therefore can be estimated and tested using the tools of
                         Chapters 6 and 7. First, however, we return to the California test score data and
                         consider the relationship between test scores and district income.

                   Test Scores and District Income

                         In Chapter 7, we found that the economic background of the students is an important
                         factor in explaining performance on standardized tests. That analysis used
                         two economic background variables (the percentage of students qualifying for a
                         subsidized lunch and the percentage of district families qualifying for income
                         assistance) to measure the fraction of students in the district coming from poor
                         families. A different, broader measure of economic background is the average
                         annual per capita income in the school district (“district income”). The California
                         data set includes district income measured in thousands of 1998 dollars. The sample
                         contains a wide range of income levels: For the 420 districts in our sample, the
                         median district income is 13.7 (that is, $13,700 per person), and it ranges from 5.3
                         ($5,300 per person) to 55.3 ($55,300 per person).

                              Figure 8.2 shows a scatterplot of fifth-grade test scores against district income
                         for the California data set, along with the OLS regression line relating these two
                         variables. Test scores and average income are strongly positively correlated, with
                         a correlation coefficient of 0.71; students from affluent districts do better on the
                         tests than students from poor districts. But this scatterplot has a peculiarity: Most
                         of the points are below the OLS line when income is very low (under $10,000) or
                         very high (over $40,000), but are above the line when income is between $15,000
                         and $30,000. There seems to be some curvature in the relationship between test
                         scores and income that is not captured by the linear regression.
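The residual pattern described above can be checked numerically. The following is a minimal sketch, not the analysis behind Figure 8.2: it assumes hypothetical, noise-free data generated from an arbitrary concave curve (the California data are not reproduced here), fits a straight line by OLS, and averages the residuals within the three income bands mentioned in the text. The names `ols_line` and `mean_residual` and all numerical constants are illustrative assumptions.

```python
import math


def ols_line(x, y):
    """Return (intercept, slope) of the OLS line of y on x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope


def mean_residual(x, residuals, lo, hi):
    """Average residual over observations with lo <= x < hi."""
    band = [r for xi, r in zip(x, residuals) if lo <= xi < hi]
    return sum(band) / len(band)


# Hypothetical data (NOT the California data set): income in thousands of
# dollars on an even grid from 5 to 55, and a made-up concave relationship.
income = [5.0 + 0.5 * i for i in range(101)]
score = [560.0 + 30.0 * math.log(v) for v in income]

b0, b1 = ols_line(income, score)
residuals = [yi - (b0 + b1 * xi) for xi, yi in zip(income, score)]

# Income bands taken from the text: under 10, between 15 and 30, 40 and above.
low = mean_residual(income, residuals, 0.0, 10.0)
middle = mean_residual(income, residuals, 15.0, 30.0)
high = mean_residual(income, residuals, 40.0, float("inf"))

print(f"mean residual, income < 10:        {low:.2f}")
print(f"mean residual, 15 <= income < 30:  {middle:.2f}")
print(f"mean residual, income >= 40:       {high:.2f}")
```

When the underlying relationship is concave, a straight line overpredicts at both ends of the income range and underpredicts in the middle, so the average residual is negative in the low and high bands and positive in the middle band. With real, noisy data the same band averages offer a quick diagnostic for curvature, although the signs need not come out as cleanly.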