Meet the Exponential Family


8.1 Introduction

In Chapters 2 and 3 and in Appendix A, we discussed linear regression and additive modelling; Chapters 4, 5, 6, and 7 then covered various extensions allowing for different variances, nested data, temporal correlation, and spatial correlation. In Chapters 8, 9, and 10, we discuss generalised linear modelling (GLM) and generalised additive modelling (GAM) techniques.

In linear regression and additive modelling, we use the Normal (or: Gaussian) distribution. It is important to realise that this distribution applies to the response variable. GLM and GAM extend linear and additive modelling in two respects: (i) a non-Gaussian distribution is used for the response variable, and (ii) the relationship (or link) between the response variable and the explanatory variables may be different. In this chapter, we focus on the first point, the distribution.

There are many reasons for using GLM and GAM instead of linear regression and additive modelling. Absence–presence data are (generally) coded as 1 and 0, proportional data are always between 0 and 100%, and count data are always non-negative. A Gaussian distribution, which places probability on every real value, respects none of these restrictions. The GLM and GAM models used for 0–1 and proportional data are typically based on the Bernoulli and binomial distributions, and for count data the Poisson and negative binomial distributions are common options. For continuous data, the Gaussian distribution is the most widely used, but you can also use the gamma distribution.

So before using GLMs and GAMs, we should focus on three questions: What are these distributions, what do they look like, and when would you use them? These questions form the basis of this chapter. We devote an entire chapter to this topic because, in our experience, few of our students are familiar with the Poisson, negative binomial, or gamma distributions, and some familiarity is required before entering the world of GLMs and GAMs in the next chapter.
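As a rough illustration of the data types listed above, the following sketch draws simulated values from a Bernoulli, a binomial, a Poisson, a gamma, and a Gaussian distribution and reports the range of each sample. It is written in Python using only the standard library; the helper names (`bernoulli`, `binomial`, `poisson`), the parameter values, and the seed are arbitrary choices made for this illustration and do not come from the chapter.

```python
import math
import random


def bernoulli(p, rng):
    """Return 1 (presence) with probability p, otherwise 0 (absence)."""
    return 1 if rng.random() < p else 0


def binomial(n, p, rng):
    """Number of successes in n independent Bernoulli(p) trials."""
    return sum(bernoulli(p, rng) for _ in range(n))


def poisson(mu, rng):
    """Poisson(mu) draw using Knuth's multiplication method (adequate for small mu)."""
    limit = math.exp(-mu)
    k, prod = 0, rng.random()
    while prod > limit:
        k += 1
        prod *= rng.random()
    return k


rng = random.Random(1)  # fixed seed so the sketch is reproducible
n_draws = 1000

# All parameter values below are hypothetical and chosen only for illustration.
presence = [bernoulli(0.3, rng) for _ in range(n_draws)]            # 0-1 data
proportion = [binomial(20, 0.3, rng) / 20 for _ in range(n_draws)]  # proportions
counts = [poisson(4.0, rng) for _ in range(n_draws)]                # count data
positive = [rng.gammavariate(2.0, 1.5) for _ in range(n_draws)]     # shape 2, scale 1.5
gaussian = [rng.gauss(1.0, 2.0) for _ in range(n_draws)]            # mean 1, sd 2

samples = [("Bernoulli", presence), ("binomial / n", proportion),
           ("Poisson", counts), ("gamma", positive), ("Gaussian", gaussian)]
for name, sample in samples:
    print(f"{name:13s} min = {min(sample):8.3f}   max = {max(sample):8.3f}")
```

Only the Gaussian sample is expected to contain negative values; the other four stay within the ranges that absence–presence, proportional, count, and positive continuous data can take.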
As we will see in the next chapter, a GLM (or GAM) consists of three steps: (i) choosing a distribution for the response variable, (ii) defining the systematic part in terms of covariates, and (iii) specifying the relationship (or: link) between the expected value of the response variable and the systematic part. This means that we have to stop for a moment and think about the nature of the response variable.

A.F. Zuur et al., Mixed Effects Models and Extensions in Ecology with R, Statistics for Biology and Health, DOI 10.1007/978-0-387-87458-6_8, © Springer Science+Business Media, LLC 2009
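The three steps listed above can be made concrete with a small sketch. It assumes, purely for illustration, a Poisson distribution for step (i), a single covariate `x` with made-up coefficients `ALPHA` and `BETA` for step (ii), and a log link for step (iii). None of these choices is prescribed by the text here, and the function names are hypothetical.

```python
import math

# Hypothetical coefficients, chosen only for illustration (not estimated from data).
ALPHA = 0.5  # intercept
BETA = 0.2   # slope for a single covariate x


def systematic_part(x):
    """Step (ii): the systematic part, eta = ALPHA + BETA * x."""
    return ALPHA + BETA * x


def expected_value(x):
    """Step (iii): assumed log link, log(mu) = eta, so mu = exp(eta) is always positive."""
    return math.exp(systematic_part(x))


def poisson_pmf(y, mu):
    """Step (i): assumed Poisson distribution, P(Y = y) for a count y with mean mu."""
    return math.exp(-mu) * mu ** y / math.factorial(y)


for x in (-10.0, 0.0, 5.0):
    eta = systematic_part(x)
    mu = expected_value(x)
    print(f"x = {x:5.1f}   eta = {eta:5.2f}   mu = {mu:6.3f}   P(Y = 0) = {poisson_pmf(0, mu):.3f}")
```

Under these assumptions the systematic part can be negative while the expected value stays positive, which is what a count response requires.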


In most statistics textbooks and undergraduate statistics courses, only the Normal, Poisson, and binomial distributions are discussed in any detail. However, there are various other distributions that are equally interesting for ecological data, for example, the negative binomial distribution. These are useful when the ‘ordinary’ GLMs do not work, which in practice happens quite often in ecological data analysis. Useful references for distributions within the