Things Are Not Always Linear; Additive Modelling

3.1 Introduction

In the previous chapter, we looked at linear regression, and although the word linear implies modelling only linear relationships, this is not necessarily the case. A model of the form $Y_i = \alpha + \beta_1 \times X_i + \beta_2 \times X_i^2 + \varepsilon_i$ is a linear regression model, but the relationship between $Y_i$ and $X_i$ is modelled using a second-order polynomial function. The same holds if an interaction term is used. For example, in Chapter 2, we modelled the biomass of wedge clams as a function of length, month and the interaction between length and month. But a scatterplot between biomass and length may not necessarily show a linear pattern.

The word ‘linear’ in linear regression basically means linear in the parameters. Hence, the following models are all linear regression models.

• $Y_i = \alpha + \beta_1 \times X_i + \beta_2 \times X_i^2 + \varepsilon_i$
• $Y_i = \alpha + \beta_1 \times \log(X_i) + \varepsilon_i$
• $Y_i = \alpha + \beta_1 \times (X_i \times W_i) + \varepsilon_i$
• $Y_i = \alpha + \beta_1 \times \exp(X_i) + \varepsilon_i$
• $Y_i = \alpha + \beta_1 \times \sin(X_i) + \varepsilon_i$
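Because each model in this list is linear in the parameters, it can be fitted by ordinary least squares once the transformed term has been computed as a new column. The following is a minimal sketch of that idea for the $\log(X_i)$ model, written in Python with NumPy rather than the R used elsewhere in the book. The data are simulated, and the true values 2 and 3, the noise level, and all variable names are assumptions chosen purely for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulated data (illustrative assumption): Y = 2 + 3 * log(X) + noise
x = rng.uniform(1.0, 10.0, size=200)
y = 2.0 + 3.0 * np.log(x) + rng.normal(0.0, 0.1, size=200)

# Define a new explanatory variable Z = log(X).
# The model is then Y = alpha + beta1 * Z + eps, which is linear in alpha and beta1.
z = np.log(x)

# Design matrix: a column of ones for the intercept, followed by Z
design = np.column_stack([np.ones_like(z), z])

# Ordinary least squares estimates of (alpha, beta1)
coef, *_ = np.linalg.lstsq(design, y, rcond=None)
alpha_hat, beta1_hat = coef

print(f"alpha_hat = {alpha_hat:.3f}, beta1_hat = {beta1_hat:.3f}")
```

The relationship between `y` and `x` is curved, yet the estimation step is the same least squares problem as for a straight line; only the column in the design matrix has changed. The quadratic model would need two such columns, one for $X_i$ and one for $X_i^2$.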

In all these models, we can define a new explanatory variable $Z_i$ such that we have a model of the form $Y_i = \alpha + \beta_1 \times Z_i + \varepsilon_i$. However, a model of the form $Y_i = \alpha + \beta_1 \times X_{1i} \times e^{\beta_2 \times X_{2i} + \beta_3 \times X_{3i}} + \varepsilon_i$ is not linear in the parameters.

In Chapter 2, we also discussed assessing whether the linear regression model is suitable for your data by plotting the residuals against fitted values, and residuals against each explanatory variable. If, in the biomass wedge clam example, the residuals are plotted against length and there are clear patterns, then you have a serious problem. Options to fix this problem are as follows:

• Extend the model with interaction terms.
• Extend the model with a non-linear length effect (e.g. use length and length to the power of two as explanatory variables).
• Add more explanatory variables.
• Transform the data to linearise the relationships. You can transform either the response variable or the explanatory variables. See, for example, Chapter 4 in Zuur et al. (2007) for guidance on this. An interesting discussion with arguments against transformations can be found in Keele (2008, pp. 6–7). One of the arguments is that a transformation affects the entire Y–X relationship, whereas the relationship may be linear along part of the X gradient and non-linear along another part.

A.F. Zuur et al., Mixed Effects Models and Extensions in Ecology with R, Statistics for Biology and Health, DOI 10.1007/978-0-387-87458-6_3, © Springer Science+Business Media, LLC 2009

Now suppose you have already added all possible explanatory variables and interactions, but you still see patterns in the graph of residuals against individual explanatory variables, and you do not want to transform the variables. Then you need to move on from the linear regression model, and one alternative is to use smoothing models, the subject of this chapter. These models allow for non-linear relationships between the response variable and multiple explanatory variables and are also called additive models. They are