Modelling heterogeneity: on the problem of group comparisons with logistic regression and the potential of the heterogeneous choice model

Gerhard Tutz

Received: 22 March 2019 / Revised: 21 November 2019 / Accepted: 6 December 2019
© Springer-Verlag GmbH Germany, part of Springer Nature 2019
Abstract

The comparison of coefficients of logit models obtained for different groups is widely considered as problematic because of possible heterogeneity of residual variances in latent variables. It is shown that the heterogeneous logit model can be used to account for this type of heterogeneity by considering reduced models that are identified. A model selection strategy is proposed that can distinguish between effects that are due to heterogeneity and substantial interaction effects. In contrast to the common understanding, the heterogeneous logit model is considered as a model that contains effect modifying terms, which are not necessarily linked to variances but can also represent other types of heterogeneity in the population. The alternative interpretation of the parameters in the heterogeneous logit model makes it a flexible tool that can account for various sources of heterogeneity. Although the model is typically derived from latent variables it is important that for the interpretation of parameters the reference to latent variables is not needed. Latent variables are considered as a motivation for binary models, but the effects in the models can be interpreted as effects on the binary response.

Keywords Heterogeneous choice model · Location–scale model · Heterogeneity of variances · Logit model · Group comparisons · Non-contingent response style

Mathematics Subject Classification 62J12 · 62H99 · 62P25
Gerhard Tutz, Ludwig-Maximilians-Universität München, Akademiestraße 1, 80799 Munich, Germany

1 Introduction

Allison (1999) demonstrated that comparisons of binary model coefficients across groups can be misleading if the residual variances differ between the groups. If one compares the regression coefficients of a set of explanatory variables such as age, income, and social status on a binary response, one might find different coefficients for the gender groups although the effects of the explanatory variables on the response have equal strengths in the underlying model. The reason is that coefficients are confounded with differences in variation across gender groups. Since Allison’s paper the issue has been investigated in various papers, see Williams (2009), Mood (2010), Rohwer (2015), Karlson et al. (2012), Breen et al. (2014). More recently, Kuha and Mills (2017) (henceforth KM) tried to convince readers that the problem is much less serious. In their concluding remarks they say that if researchers make sure to be clear about the target quantities of their analysis “they will in most cases be able to conclude that comparisons of estimates from such models between different groups or between different models pose no fundamental problems or at least not the kinds of
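
The confounding of coefficients with residual variation described at the beginning of this section can be made explicit with the usual latent-variable argument. The following is a sketch only; the notation (group index g, latent response y*, group-specific scale σ_g, logistic distribution function F) is assumed here for illustration and is not taken from the text above.

```latex
% Illustrative notation (assumed): g indexes groups, y_i^* is a latent response,
% \sigma_g is the residual scale in group g, F is the standard logistic cdf.
\begin{align*}
y_i^{*} &= \beta_{0g} + x_i^{\top}\beta_g + \sigma_g\,\varepsilon_i,
  \qquad \varepsilon_i \sim \text{standard logistic}, \\
y_i &= 1 \iff y_i^{*} > 0, \\
P(y_i = 1 \mid x_i, g)
  &= P\!\left(\varepsilon_i > -\frac{\beta_{0g} + x_i^{\top}\beta_g}{\sigma_g}\right)
   = F\!\left(\frac{\beta_{0g}}{\sigma_g} + x_i^{\top}\,\frac{\beta_g}{\sigma_g}\right).
\end{align*}
% The last equality uses the symmetry of the logistic distribution, 1 - F(-a) = F(a).
```

Under these assumptions, a logit model fitted separately within group g recovers only the ratio β_g/σ_g. If two groups share the same underlying effect, β_1 = β_2 = 1, but have scales σ_1 = 1 and σ_2 = 2, the fitted coefficients are 1 and 0.5, so the groups appear to differ although the latent effects are equal.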