Modified empirical likelihood-based confidence intervals for data containing many zero observations
Patrick Stewart¹ · Wei Ning¹,²
Received: 8 November 2019 / Accepted: 23 April 2020
© Springer-Verlag GmbH Germany, part of Springer Nature 2020
Abstract
Data containing many zeros are common in statistical applications, such as survey data. A confidence interval based on the traditional normal approximation may have poor coverage probability, especially when the nonzero values are highly skewed and the sample size is small or moderately large. The empirical likelihood (EL), a powerful nonparametric method, was proposed to construct confidence intervals under such a scenario. However, the traditional empirical likelihood suffers from undercoverage: the coverage probability of EL-based confidence intervals falls below the nominal level, especially for small sample sizes. In this paper, we investigate the numerical performance of three modified versions of the EL, namely the adjusted empirical likelihood, the transformed empirical likelihood, and the transformed adjusted empirical likelihood, for data with various sample sizes and various proportions of zero values. The likelihood-type statistics are shown to follow the standard chi-square distribution asymptotically. Simulations are conducted to compare the coverage probabilities with those of other existing methods under different distributions. A real data example illustrates the procedure for constructing the confidence intervals.

Keywords Zero observations · Adjusted empirical likelihood · Transformed empirical likelihood · Transformed adjusted empirical likelihood · Confidence intervals · Coverage probability
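The undercoverage of the normal-approximation interval mentioned in the abstract can be illustrated with a small Monte Carlo sketch. This is not the simulation design of the paper: the zero-inflated lognormal model, the function names `normal_ci` and `simulate_coverage`, and all parameter values (`n`, `p_zero`, `sigma`, `reps`) are assumptions chosen only for illustration.

```python
import numpy as np


def normal_ci(x, z=1.959963984540054):
    """Normal-approximation interval for the mean (z defaults to the 97.5% normal quantile)."""
    n = len(x)
    mean = x.mean()
    se = x.std(ddof=1) / np.sqrt(n)  # estimated standard error of the sample mean
    return mean - z * se, mean + z * se


def simulate_coverage(n=50, p_zero=0.7, sigma=1.5, reps=5000, seed=0):
    """Estimate how often the nominal 95% interval covers the true mean.

    Assumed model (illustrative only): each observation is 0 with
    probability p_zero, otherwise lognormal(0, sigma).
    """
    rng = np.random.default_rng(seed)
    # Mean of lognormal(0, sigma) is exp(sigma^2 / 2); zeros scale it by (1 - p_zero).
    true_mean = (1.0 - p_zero) * np.exp(sigma**2 / 2.0)
    hits = 0
    for _ in range(reps):
        is_nonzero = rng.random(n) >= p_zero
        x = np.where(is_nonzero, rng.lognormal(0.0, sigma, n), 0.0)
        lo, hi = normal_ci(x)
        hits += lo <= true_mean <= hi
    return hits / reps


if __name__ == "__main__":
    print("estimated coverage:", simulate_coverage())
```

Under these assumed settings the estimated coverage would be expected to fall noticeably below the nominal 95% level, which is the behaviour that motivates the EL-based alternatives studied in the paper.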
Wei Ning [email protected]
1 Department of Mathematics and Statistics, Bowling Green State University, Bowling Green, OH 43403, USA
2 School of Mathematics and Statistics, Beijing Institute of Technology, Beijing 100081, China
1 Introduction
Many statistical applications involve data with a significant proportion of zero values. Many fields of science, such as ecology and medicine, produce data from zero-inflated, positively skewed populations. In general, a confidence interval can be constructed from the normal approximation of the corresponding point estimator through the Central Limit Theorem. However, the validity of asymptotic normality is doubtful for highly skewed data unless the sample size is sufficiently large. As a consequence, the resulting interval may have a coverage probability below the nominal level. Cox and Snell (1979) made parametric inference on the total population error under the assumption of positivity when the data contain many zeros. Tamura (1988) pointed out that none of the parametric distributions are suitable for modeling the error distribution. Kvanli et al. (1998) considered a mixture model of zero and one of the commonly used parametric distributions to construct the confidence interval for the population mean under this scenario. However, they pointed out that