Improved ridge regression estimators for the logistic regression model



A. K. Md. E. Saleh · B. M. Golam Kibria

Received: 2 April 2012 / Accepted: 26 March 2013 © Springer-Verlag Berlin Heidelberg 2013

Abstract The estimation of the regression parameters of the ill-conditioned logistic regression model is considered in this paper. We propose five ridge regression (RR) estimators, namely, the unrestricted RR, restricted RR, preliminary test RR, shrinkage RR and positive-rule RR estimators, for estimating the parameters β when it is suspected that β may belong to a linear subspace defined by Hβ = h. Asymptotic properties of the estimators are studied with respect to quadratic risks. The performances of the proposed estimators are compared on the basis of quadratic bias and risk functions under both null and alternative hypotheses, which specify certain restrictions on the regression parameters. Conditions for the superiority of the proposed estimators in terms of the departure and ridge parameters are given. Graphical representations and an efficiency analysis are presented that support the findings of the paper.

Keywords

Dominance · Efficiency · Pre-test · Risk function · Stein-rule estimator

1 Introduction Logistic regression is a popular method to model binary data in biostatistics and health sciences. Unstable parameter estimates occur when the number of covariates is relatively large or when the covariates are highly correlated. This paper will deal with

A. K. Md. E. Saleh
School of Mathematics and Statistics, Carleton University, Ottawa K1S 5B6, Canada

B. M. G. Kibria (B)
Department of Mathematics and Statistics, Florida International University, Miami, FL 33199, USA
e-mail: [email protected]



the estimation of the parameters of the logistic regression model when the covariates are highly correlated. To describe the problem, let Yi ∈ {0, 1} denote the dichotomous dependent variable and let xi = (1, x1i, x2i, …, xpi)′ be a (p+1)-dimensional vector of explanatory variables for the ith observation. The conditional probability of Yi = 1 given xi is given by

$$P(Y_i = 1 \mid x_i) = \pi(x_i) = \left(1 + e^{-\beta' x_i}\right)^{-1} \tag{1.1}$$

where β = (β0, β1, …, βp)′ is the (p+1)-vector of regression parameters of interest. The logit transformation in terms of π(xi) is given by

$$\ln\left(\frac{\pi(x_i)}{1 - \pi(x_i)}\right) = \beta' x_i = \beta_0 + \beta_1 x_{1i} + \cdots + \beta_p x_{pi} \tag{1.2}$$
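As a quick numerical check (a minimal sketch using NumPy; the function names are illustrative, not from the paper), the logistic function in (1.1) and the logit transformation in (1.2) are inverses of each other:

```python
import numpy as np

def logistic(eta):
    # pi(x) = 1 / (1 + exp(-beta' x)), where eta is the linear predictor beta' x
    return 1.0 / (1.0 + np.exp(-eta))

def logit(p):
    # ln(pi / (1 - pi)), the inverse of the logistic function
    return np.log(p / (1.0 - p))

eta = np.array([-2.0, 0.0, 1.5])
p = logistic(eta)
# Applying logit to the fitted probabilities recovers the linear predictor
print(np.allclose(logit(p), eta))  # prints True
```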

Our primary objective is to estimate β = (β0, β1, …, βp)′ when it is suspected that β belongs to the linear subspace defined by Hβ = h, where H is a q × (p+1) matrix of known real values with rank q and h is a q-vector of known real values. The most common method of estimating β is the maximum likelihood (ML) method. The ML equation is given by

$$X'(Y - \pi(x)) = 0 \tag{1.3}$$

where X is the n × (p+1) matrix of covariates, Y = (Y1, Y2, …, Yn)′ is the n-vector of binary responses and π(x) = (π(x1), …, π(xn))′. Since (1.3) is nonlinear in β, one has to use an iterative method to solve it.
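The standard iterative method for solving the score equation (1.3) is Newton–Raphson, equivalent to iteratively reweighted least squares (IRLS). The following is a minimal sketch, not the paper's implementation; the function and variable names are illustrative, and a full-rank design matrix is assumed:

```python
import numpy as np

def logistic_mle(X, y, tol=1e-8, max_iter=100):
    """Newton-Raphson (IRLS) solver for the ML equation X'(Y - pi) = 0."""
    n, p1 = X.shape
    beta = np.zeros(p1)
    for _ in range(max_iter):
        eta = X @ beta
        pi = 1.0 / (1.0 + np.exp(-eta))   # pi(x_i) as in (1.1)
        W = pi * (1.0 - pi)               # diagonal of the IRLS weight matrix
        grad = X.T @ (y - pi)             # score vector: left side of (1.3)
        hess = X.T @ (W[:, None] * X)     # Fisher information X'WX
        step = np.linalg.solve(hess, grad)
        beta = beta + step
        if np.max(np.abs(step)) < tol:
            break
    return beta

# Usage: simulate data from a known beta and recover it approximately
rng = np.random.default_rng(0)
X = np.column_stack([np.ones(2000), rng.normal(size=(2000, 2))])
true_beta = np.array([0.5, 1.0, -1.0])
y = rng.binomial(1, 1.0 / (1.0 + np.exp(-(X @ true_beta))))
beta_hat = logistic_mle(X, y)
```

When the columns of X are highly correlated, the matrix X′WX inverted at each step is ill-conditioned, which is exactly the situation motivating the ridge estimators studied in this paper.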