Model identification and selection for single-index varying-coefficient models
Peng Lai¹ · Fangjian Wang¹ · Tingyu Zhu² · Qingzhao Zhang³

Received: 9 January 2018 / Revised: 5 March 2020
© The Institute of Statistical Mathematics, Tokyo 2020
Abstract
Single-index varying-coefficient models include many popular semiparametric models, such as single-index models, partially linear models, and varying-coefficient models. In this paper, a two-stage efficient variable selection procedure is proposed to select the important nonparametric and parametric components and to obtain their estimators simultaneously. We also find that the proposed procedure can automatically separate predictors into varying-coefficient and constant-coefficient predictors. Theoretically, the procedure is consistent in both selection and estimation. Simulation studies and a real data application are conducted to evaluate and illustrate the proposed methods.

Keywords Efficient estimating equation · Group LASSO · Single-index varying-coefficient model · Variable selection
1 Introduction

Consider a single-index varying-coefficient model of the form

Y = g⊤(X⊤𝛽)Z + 𝜀,    (1)
where X ∈ ℝ^p and Z ∈ ℝ^q are vectors of covariates, Y is the response variable, 𝛽 is a p × 1 vector of unknown parameters with ‖𝛽‖ = 1 and its first component being positive for the sake of identifiability (‖ ⋅ ‖ denotes the Euclidean norm),
* Qingzhao Zhang [email protected]
¹ School of Mathematics and Statistics, Nanjing University of Information Science & Technology, Nanjing 210044, China
² Department of Statistics, Oregon State University, Corvallis, OR 97331, USA
³ Department of Statistics, School of Economics, The Wang Yanan Institute for Studies in Economics, MOE Key Lab of Economics and Fujian Key Lab of Statistics, Xiamen University, Xiamen 361005, China
g(⋅) = (g₁(⋅), …, g_q(⋅))⊤ is a q × 1 vector of unknown functions, and 𝜀 is a random error with E(𝜀|X, Z) = 0 and Var(𝜀|X, Z) = 𝜎² < ∞. Model (1) includes many important statistical models, such as the linear regression model, the varying-coefficient model and the single-index model. For more details, see Xue and Wang (2012) and Lai et al. (2016).

In this work, we are interested in estimating the parametric coefficients 𝛽 and the functions g(⋅), where 𝛽 and g(⋅) are sparse in the sense that some of their elements are zero, and some of the g_k(⋅)'s may be nonzero constants. Sparsity plays a crucial role in high-dimensional analysis, as it can improve both interpretability and prediction accuracy. In addition, separating the varying effects from the constant effects has important implications, for example, in gene-environment interaction studies (Wu et al. 2014, 2015, 2018).

Many studies have investigated statistical inference for single-index varying-coefficient models, such as Xue and Wang (2012), Xue and Pang (2013), and Huang and Zhang (2013). However, these methods give nonzero estimates to all coefficients. Various penalization methods that can automatically select relevant parame
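
To make the structure of model (1) concrete, the following Python sketch simulates data from it under the identifiability constraint on 𝛽 and the sparsity pattern described above: one varying coefficient function, one nonzero constant, and one identically zero function. It is an illustration only, not the authors' simulation design. The function names `normalize_index` and `simulate_sivcm`, the particular choices of g_k, the covariate distributions, the sample size, and the noise level are all assumptions made for this example.

```python
import numpy as np


def normalize_index(beta):
    """Rescale beta so that ||beta|| = 1 and its first component is positive."""
    beta = np.asarray(beta, dtype=float)
    norm = np.linalg.norm(beta)
    if norm == 0.0 or beta[0] == 0.0:
        raise ValueError("beta must be nonzero with a nonzero first component")
    beta = beta / norm
    # Flip the overall sign if needed so that the first component is positive.
    return -beta if beta[0] < 0 else beta


def simulate_sivcm(n, beta, g_funcs, sigma=0.5, seed=0):
    """Draw (X, Z, Y) from Y = g(X'beta)' Z + eps (hypothetical design)."""
    rng = np.random.default_rng(seed)
    beta = normalize_index(beta)
    p, q = beta.size, len(g_funcs)
    X = rng.uniform(-1.0, 1.0, size=(n, p))   # assumed covariate distribution
    Z = rng.normal(size=(n, q))               # assumed covariate distribution
    u = X @ beta                              # single index X'beta
    G = np.column_stack([g(u) for g in g_funcs])  # n x q matrix of g_k(u_i)
    eps = rng.normal(scale=sigma, size=n)     # homoscedastic error
    Y = np.sum(G * Z, axis=1) + eps
    return X, Z, Y, beta


# Sparse index vector: the second and fourth components are zero.
beta_raw = [-3.0, 0.0, 4.0, 0.0]

# g_1 varies with the index, g_2 is a nonzero constant, g_3 is identically zero.
g_funcs = [
    lambda u: np.sin(np.pi * u),
    lambda u: np.full_like(u, 2.0),
    lambda u: np.zeros_like(u),
]

X, Z, Y, beta = simulate_sivcm(200, beta_raw, g_funcs)
print(beta, X.shape, Z.shape, Y.shape)
```

In this setup, a procedure of the kind discussed in the paper would ideally drop the third component of Z, treat the second as having a constant coefficient, keep the first as varying, and set the second and fourth components of 𝛽 to zero.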