Group Selection in Semiparametric Accelerated Failure Time Model
Longlong Huang, Karen Kopciuk, and Xuewen Lu
Abstract In survival analysis, a number of regression models can be used to estimate the effects of covariates on a censored survival outcome. When covariates can be naturally grouped, group selection is important in these models. Motivated by the group bridge approach for variable selection in a multiple linear regression model, we consider group selection in a semiparametric accelerated failure time (AFT) model using Stute’s weighted least squares and a group bridge penalty. This method can simultaneously carry out feature selection at both the group level and the within-group individual variable level, and it enjoys the oracle group selection property. Simulation studies indicate that the group bridge approach for the AFT model can correctly identify important groups and variables even with a high censoring rate. A real data analysis is provided to illustrate the application of the proposed method.
5.1 Introduction Variable selection, an important objective of survival analysis, aims to choose a minimum number of important variables to model the relationship between a lifetime response and potential risk factors. To select significant variables and estimate regression coefficients automatically and simultaneously, a family of penalized or regularized approaches has been proposed. Variable selection is conducted by minimizing a penalized objective function, obtained by adding a penalty
L. Huang • X. Lu
Department of Mathematics and Statistics, University of Calgary, 2500 University Drive NW, T2N 1N4, Calgary, AB, Canada
K. Kopciuk
Department of Cancer Epidemiology and Prevention Research, Alberta Health Services, 5th Floor, Holy Cross Centre Box ACB, 2210 2 St. SW, T2S 3C3, Calgary, AB, Canada
© Springer Science+Business Media Singapore 2016
D.-G. (Din) Chen et al. (eds.), Advanced Statistical Methods in Data Science, ICSA Book Series in Statistics, DOI 10.1007/978-981-10-2594-5_5
function to the loss function, in the following form: $\min \{\text{Loss function} + \text{Penalty}\}$. Popular choices of loss function are least squares and the negative log-likelihood. Many different penalty functions have been used for penalized regression, such as the least absolute shrinkage and selection operator (LASSO) (Tibshirani 1996), the bridge penalty (Fu 1998), the smoothly clipped absolute deviation (SCAD) method (Fan and Li 2001), the elastic-net method (Zou and Hastie 2005), the minimax concave penalty (MCP) (Zhang 2010) and the smooth integration of counting and absolute deviation (SICA) method (Lv and Fan 2009). These methods are designed for individual variable selection. In many applications, covariates in X are grouped. For example, in a multi-factor analysis of variance (ANOVA) problem, each factor may have several levels and can be expressed through a group of dummy variables, such as for response Z with
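
As an illustration of the generic form min {Loss function + Penalty}, the following minimal Python sketch evaluates a penalized least squares objective under the LASSO and bridge penalties named above. It is not the chapter's estimator: the function names, the 1/(2n) scaling of the loss, and the default bridge exponent gamma = 0.5 are assumptions made only for this example, and censoring is ignored entirely.

```python
from typing import Callable, Sequence

Vector = Sequence[float]
Matrix = Sequence[Sequence[float]]


def least_squares_loss(X: Matrix, y: Vector, beta: Vector) -> float:
    """Least squares loss, scaled by 1/(2n) (the scaling is an assumption)."""
    n = len(y)
    total = 0.0
    for row, yi in zip(X, y):
        fitted = sum(xij * bj for xij, bj in zip(row, beta))
        total += (yi - fitted) ** 2
    return total / (2 * n)


def lasso_penalty(beta: Vector, lam: float) -> float:
    """LASSO penalty: lam * sum_j |beta_j|."""
    return lam * sum(abs(b) for b in beta)


def bridge_penalty(beta: Vector, lam: float, gamma: float = 0.5) -> float:
    """Bridge penalty: lam * sum_j |beta_j|^gamma (gamma value is illustrative)."""
    return lam * sum(abs(b) ** gamma for b in beta)


def penalized_objective(
    X: Matrix,
    y: Vector,
    beta: Vector,
    penalty: Callable[[Vector, float], float],
    lam: float,
) -> float:
    """Loss function + Penalty, evaluated at a given coefficient vector."""
    return least_squares_loss(X, y, beta) + penalty(beta, lam)


# Toy data (hypothetical): y is fitted exactly by beta = (1, 2).
X = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
y = [1.0, 2.0, 3.0]

obj_lasso = penalized_objective(X, y, [1.0, 2.0], lasso_penalty, lam=0.5)
obj_bridge = penalized_objective(X, y, [1.0, 2.0], bridge_penalty, lam=0.5)
print(obj_lasso, obj_bridge)
```

At the exact-fit coefficients the loss term vanishes, so the printed values are the penalty terms alone. A minimizer would trade a small increase in loss for a smaller penalty, which is what shrinks some coefficients toward zero.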