Mixed Effects Modelling for Nested Data


In this chapter, we continue with Gaussian linear and additive mixed modelling methods and discuss their application to nested data. Nested data is also referred to as hierarchical data or multilevel data in other scientific fields (Snijders and Bosker, 1999; Raudenbush and Bryk, 2002). In the first section of this chapter, we outline mixed effects models for nested data before moving on to a formal introduction in the second section. Several different types of mixed effects models are presented, followed by a section discussing the induced correlation structure between observations. Maximum likelihood and restricted maximum likelihood estimation methods are discussed in Section 5.6. The material presented in Section 5.6 is more technical, and you need only skim through it if you are not interested in the mathematical details. Model selection and model validation tools are presented in Sections 5.7, 5.8, and 5.9. A detailed example is presented in Section 5.10.

5.1 Introduction

Zuur et al. (2007) used marine benthic data from nine inter-tidal areas along the Dutch coast. The data were collected by the Dutch institute RIKZ in the summer of 2002. In each inter-tidal area (denoted by ‘beach’), five samples were taken, and the macro-fauna and abiotic variables were measured. Zuur et al. (2007) used species richness (the number of different species) and NAP (the height of a sampling station compared to mean tidal level) from these data to illustrate statistical methods like linear regression and mixed effects modelling. Here, we use the same data, but from a slightly different pedagogical angle. Mixed modelling may not be the optimal statistical technique to analyse these data, but it is a useful data set for our purposes. It is relatively small, and it shows all the characteristics of a data set that needs a mixed effects modelling approach.

The underlying question for these data is whether there is a relationship between species richness, exposure, and NAP. Exposure is an index composed of the following elements: wave action, length of the surf zone, slope, grain size, and the depth of the anaerobic layer.

A.F. Zuur et al., Mixed Effects Models and Extensions in Ecology with R, Statistics for Biology and Health, DOI 10.1007/978-0-387-87458-6_5, © Springer Science+Business Media, LLC 2009


As species richness is a count (number of different species), a generalised linear model (GLM) with a Poisson distribution may be appropriate. However, we want to keep things simple for now; so we begin with a linear regression model with the Gaussian distribution and leave using Poisson GLMs until later. A first candidate model for the data is

\[
R_{ij} = \alpha + \beta_1 \times \mathrm{NAP}_{ij} + \beta_2 \times \mathrm{Exposure}_i + \varepsilon_{ij}, \qquad \varepsilon_{ij} \sim N(0, \sigma^2) \tag{5.1}
\]
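As a rough illustration of model (5.1), the following sketch simulates a data set with the same nested layout (nine beaches, five sites per beach) and fits the regression by ordinary least squares. It is a minimal sketch under stated assumptions: the numbers are synthetic and are not the RIKZ data, the parameter values and exposure levels are made up, and the function names (simulate_beaches, solve_linear_system, fit_linear_regression) are hypothetical helpers introduced only for this example. Python is used here purely for illustration.

```python
import random


def simulate_beaches(n_beaches=9, n_sites=5, alpha=6.0, beta_nap=-2.5,
                     beta_exposure=-0.5, sigma=1.0, seed=1):
    """Return rows (beach i, site j, NAP_ij, Exposure_i, R_ij) from model (5.1).

    All values are invented for illustration; they are NOT the RIKZ data.
    """
    rng = random.Random(seed)
    exposure_levels = (8.0, 10.0, 11.0)  # hypothetical index values
    rows = []
    for i in range(1, n_beaches + 1):
        # Exposure carries only the beach index i: one value per beach.
        exposure = exposure_levels[(i - 1) % len(exposure_levels)]
        for j in range(1, n_sites + 1):
            nap = rng.uniform(-1.5, 2.0)   # NAP varies per site (i and j)
            eps = rng.gauss(0.0, sigma)    # eps_ij ~ N(0, sigma^2)
            richness = alpha + beta_nap * nap + beta_exposure * exposure + eps
            rows.append((i, j, nap, exposure, richness))
    return rows


def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b_k] for row, b_k in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def fit_linear_regression(rows):
    """Least-squares estimates [alpha, beta1, beta2] and the residual sigma."""
    X = [[1.0, row[2], row[3]] for row in rows]  # intercept, NAP, Exposure
    y = [row[4] for row in rows]                 # species richness
    p = 3
    xtx = [[sum(x[a] * x[b] for x in X) for b in range(p)] for a in range(p)]
    xty = [sum(x[a] * y_k for x, y_k in zip(X, y)) for a in range(p)]
    coef = solve_linear_system(xtx, xty)         # normal equations
    residuals = [y_k - sum(c * x_k for c, x_k in zip(coef, x))
                 for x, y_k in zip(X, y)]
    sigma_hat = (sum(r * r for r in residuals) / (len(y) - p)) ** 0.5
    return coef, sigma_hat


if __name__ == "__main__":
    data = simulate_beaches()
    estimates, sigma_hat = fit_linear_regression(data)
    print("alpha, beta1, beta2:", estimates)
    print("residual sigma:", sigma_hat)
```

Note that this fit treats all 45 observations as independent. It ignores the fact that sites from the same beach share a beach index i, which is the nesting that this chapter is concerned with.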

$R_{ij}$ is the species richness at site $j$ on beach $i$, $\mathrm{NAP}_{ij}$ the corresponding NAP value, $\mathrm{Exposure}_i$ the exposure on beach $i$, and $\varepsilon_{ij}$ the unexplained information. Indeed, this is the familiar linear regression model. The