On a non-parametric confidence interval for the regression slope
Róbert Tóth¹ · Ján Somorčík²
Received: 17 August 2016 / Accepted: 8 April 2017 © Sapienza Università di Roma 2017
Abstract
We investigate the application of Tukey’s methodology in Theil’s regression to obtain a confidence interval for the true slope in the straight-line regression model with not necessarily normal errors. This approach has been implemented in an R package since 2005, but without any theoretical background. We illustrate by Monte Carlo simulation that this methodology, unlike the classical Theil approach, seriously deflates the true confidence level of the resulting interval. We also provide rigorous proofs for the case of four data points (in general) and five data points (under some additional conditions), together with a real-life usage example in the latter case. In summary, we demonstrate that one should never combine statistical methods without checking the assumptions of their usage, and we warn the already wide community of R users of Theil’s regression in various fields of science.
Keywords Theil’s regression · Tukey’s confidence interval · Walsh averages · Software R
¹ Tangent Works, Na Slavíne 1, 81104 Bratislava, Slovakia
² Comenius University Bratislava, Mlynská dolina, 84248 Bratislava, Slovakia

1 Introduction
Theil’s regression (sometimes referred to as Theil–Sen regression) is a robust nonparametric replacement for the traditional least squares approach to the straight-line regression model Y = β0 + β1 x + ε, and also to some more complex linear regression models (see the pioneering three-part article [23]). Theil’s methodology does not require normality of the random errors ε, yet it provides parameter estimates, tests of linear hypotheses, and confidence intervals (see e.g. [10] for a detailed description). We focus on the confidence interval (CI) for the true slope β1. For the software R [20] there exists a package called mblm [13] that includes many tools of Theil’s regression. Surprisingly, when asked for a CI for β1, the package does not compute the classical Theil CI for β1 [23]. Instead, it uses a different approach that applies, without any reference, the well-known CI based on Wilcoxon’s signed rank test. In the general setting, the CI based on Wilcoxon’s signed rank test has been ascribed to John Tukey (see [10] for historical details), who originally developed it to obtain a CI for the true center of symmetry of a symmetric distribution from which a sample of independent and identically distributed data is observed. However, it turns out (see Sect. 4) that in the case of slope estimation in Theil’s regression the input data are definitely not independent. Therefore, the true confidence level of the interval provided by the package mblm is in question, and our paper shows that this concern is justified. We think that it is important to point out and study this iss
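
To make the dependence issue concrete, the following minimal Python sketch computes the pairwise slopes on which Theil’s regression is built, together with their Walsh averages. It is an illustration only: the data are hypothetical, the helper names `pairwise_slopes` and `walsh_averages` are our own, and the code is not taken from the package mblm. With n = 5 points there are already 10 pairwise slopes, and each point enters four of them, so the slopes cannot be treated as an independent sample.

```python
from itertools import combinations, combinations_with_replacement
from statistics import median


def pairwise_slopes(x, y):
    """Slopes of the lines through every pair of points with distinct x."""
    return [
        (y[j] - y[i]) / (x[j] - x[i])
        for i, j in combinations(range(len(x)), 2)
        if x[i] != x[j]
    ]


def walsh_averages(values):
    """All averages (v_i + v_j) / 2 with i <= j."""
    return [(a + b) / 2 for a, b in combinations_with_replacement(values, 2)]


# Hypothetical toy data, not taken from the paper.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

slopes = pairwise_slopes(x, y)
theil_slope = median(slopes)    # Theil's point estimate of the slope
walsh = walsh_averages(slopes)  # Walsh averages of the pairwise slopes

print(len(slopes))  # 10 pairwise slopes from only 5 points
print(len(walsh))   # 55 Walsh averages of those slopes
print(theil_slope)
```

Every slope here shares a data point with six of the other nine slopes, which is the kind of dependence that the independence assumption behind a Wilcoxon-based interval rules out.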