Semiparametric modeling of the right-censored time-series based on different censorship solution techniques



Dursun Aydın¹ · Ersin Yılmaz¹

Received: 28 February 2020 / Accepted: 16 September 2020
© Springer-Verlag GmbH Germany, part of Springer Nature 2020

Abstract

In this paper, we employ the penalized spline method to estimate the components of a right-censored semiparametric time-series regression model with autoregressive errors. Because of the censoring, the parameters of such a model cannot be computed directly by ordinary statistical methods, and a transformation of the data is therefore required. We propose three data transformation techniques: Gaussian imputation (GI), k nearest neighbors (kNN) and Kaplan–Meier weights (KMW). These techniques, which are modified extensions of the ordinary GI, kNN and KMW approximations, are used to adjust the censored response variable in a time-series setting. Detailed Monte Carlo experiments and a real time-series data example are carried out to assess the performance of the proposed approaches and to analyze the effects of different censoring levels and sample sizes. The results reveal that the censored semiparametric time-series models based on kNN imputation often work better than those estimated by GI or KMW.

Keywords Right-censored time-series · Gaussian imputation · kNN imputation · Kaplan–Meier weights · Penalized splines · Semiparametric regression

¹ Department of Statistics, Faculty of Science, Mugla Sitki Kocman University, 48000 Mugla, Turkey

1 Introduction

In the econometrics and statistics literature, the term right-censored data refers to observations whose values cannot be observed beyond a cutoff value. Time-series measurements are often subject to data irregularities, such as observations censored by a detection limit. Specifically, response values that exceed the detection limit are not known, and these incomplete observations are recorded as the value of the detection limit. As a result, classical semiparametric time-series regression analysis cannot be applied directly to right-censored data. Note that when the response observations are uncensored, classical time-series regression models with autoregressive errors are analyzed by parametric methods; see Box and Jenkins (1970) and Brockwell and Davis (1991) for detailed discussions. In the presence of censoring, the estimates obtained from parametric methods are highly biased and unreliable. One way to handle this problem is to replace the censored data points with reasonable values derived from the observed data via imputation methods, where imputation refers to the process of replacing the censored data with substituted values. Another way to cope with censored data is to consider the weighted Kaplan–Meier estimator of the observed response varia