Generalised Estimation Equations

In this chapter, we analyse three data sets: California birds, owls, and deer. In the first data set, the response variable is the number of birds measured repeatedly over time at two-weekly intervals at the same locations. In the owl data set (Chapter 5), the response variable is the number of calls made by all offspring in the absence of the parent. We have multiple observations from the same nest, and 27 nests were sampled. In the deer data, the response variable is the presence or absence of parasites in a deer; the data are from multiple farms. In the first instance, we apply a generalised linear model (GLM) with a Poisson distribution to the California bird and owl data and a binomial GLM to the deer data. However, such analyses violate the independence assumption: the California bird data have a longitudinal aspect, the owl data contain multiple observations per nest, and the deer data contain multiple deer from the same farm. We therefore introduce generalised estimation equations (GEE) as a tool to include a dependence structure, discuss the underlying mathematics, and apply GEE to the same data sets. GEE was introduced by Liang and Zeger (1986), and since their publication, several approaches have been developed to improve the technique. We use the original method as it is the simplest. Useful GEE references are Ziegler et al. (1996), Greene (1997), Fitzmaurice et al. (2004), and a textbook completely dedicated to GEE by Hardin and Hilbe (2002). This chapter draws heavily on the Fitzmaurice et al. (2004) book. Chapter 22 contains a binary GEE case study.
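To see why the independence violation matters, the following is a minimal sketch, not the chapter's analysis and not a GEE fit. It simulates clustered observations and compares a standard error that treats every observation as independent with one based on cluster means. The 27 clusters echo the 27 owl nests; the 20 observations per cluster, the Gaussian response (used instead of counts for simplicity), the two standard deviations, and all function names are assumptions made for illustration only.

```python
import random
import statistics


def simulate_clustered(n_clusters=27, per_cluster=20,
                       cluster_sd=1.0, noise_sd=1.0, seed=1):
    """Simulate clusters (e.g. nests) whose observations share a common shift.

    All parameter values are illustrative assumptions, not estimates
    from the owl, bird, or deer data.
    """
    rng = random.Random(seed)
    data = []
    for _ in range(n_clusters):
        # The shared shift makes observations within a cluster correlated.
        shift = rng.gauss(0.0, cluster_sd)
        data.append([shift + rng.gauss(0.0, noise_sd)
                     for _ in range(per_cluster)])
    return data


def naive_se(data):
    """Standard error of the overall mean, wrongly assuming independence."""
    flat = [y for cluster in data for y in cluster]
    return statistics.stdev(flat) / len(flat) ** 0.5


def cluster_se(data):
    """Standard error of the overall mean based on the cluster means.

    Valid here because every cluster has the same number of observations.
    """
    means = [statistics.mean(cluster) for cluster in data]
    return statistics.stdev(means) / len(means) ** 0.5


data = simulate_clustered()
print("naive SE:  ", round(naive_se(data), 3))
print("cluster SE:", round(cluster_se(data), 3))
```

In this simulated setting the naive standard error comes out clearly smaller than the cluster-based one, so a model that ignores the dependence would overstate the precision of its estimates. GEE addresses the same problem within the GLM framework by specifying a dependence structure rather than by averaging within clusters.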

A.F. Zuur et al., Mixed Effects Models and Extensions in Ecology with R, Statistics for Biology and Health, DOI 10.1007/978-0-387-87458-6_12, © Springer Science+Business Media, LLC 2009

12.1 GLM: Ignoring the Dependence Structure

12.1.1 The California Bird Data

Elphick and Oring (1998, 2003) and Elphick et al. (2007) analysed time series of several water bird species recorded in California rice fields. Their main goals were to determine whether flooding fields after harvesting results in greater use by aquatic birds, whether different methods of manipulating the straw in conjunction with flooding influence how many fields are used, and whether the depth to which the fields are flooded is important. Biological details can be found in the references mentioned above. Counts were made during winter surveys at several fields. Here, we only use data from one winter (1993–1994), and we use species richness to summarise the 49 bird species recorded. The sampling took place at multiple sites, and from each site, multiple fields were repeatedly sampled. Here, we only use one site (called 4mile) for illustrative purposes. There are 11 fields at this site, and each field was repeatedly sampled; see Fig. 12.1. Note that there is a general decline in bird numbers over time. One of the available covariates is water depth per field, but water depth and time are collinear (as can be inf