


Efficiency of Domain Mean Estimators in the Presence of Non‑response Using Two‑Stage Sampling with Non‑linear and Linear Cost Function

David A. Alilah1 · C. O. Ouma2 · E. O. Ombaka3

Received: 12 May 2020 / Revised: 25 July 2020 / Accepted: 14 August 2020
© Springer-Verlag GmbH Germany, part of Springer Nature 2020

Abstract

This study compares the efficiency of estimators of the domain population mean computed under a double sampling for ratio estimation design with linear, non-linear or logarithmic cost functions. In estimating the domain mean, information on the study and auxiliary variables, which suffer from non-response at the second-phase sampling, has been used. Optimal stratum sample sizes for a given set of unit costs have been computed using Lagrangian multiplier partial differential equations and Taylor's linearization series. The relative precision of the resulting domain population mean estimates has been compared using mean square error, relative efficiency, bias and absolute relative bias. From the results obtained, the estimator of the domain population mean computed using auxiliary information was, on average, more efficient than those computed without auxiliary information. Of the three cost functions used in computing the estimators, the logarithmic cost function produced a superior estimator compared with those computed using the linear or non-linear (quadratic) cost function. This method of estimation can be used to estimate efficiently the incidence rate of infection in a population during a given pandemic, using an optimal sample size at minimum survey cost.

Keywords  Double sampling for ratio estimation · Domain mean · Auxiliary variable · Non-linear cost function · Non-response and optimal allocation

* David A. Alilah [email protected]
C. O. Ouma [email protected]
E. O. Ombaka [email protected]

1 Department of Mathematics, Masinde Muliro University of Science and Technology, Kakamega, Kenya

2 Department of Statistics and Actuarial Science, Kenyatta University, Nairobi, Kenya

3 Department of Mathematics, Garissa University, Garissa, Kenya






Annals of Data Science

1 Introduction

A number of methods have been used to obtain domain samples from the target population. Among the popular methods are cross-sectional and longitudinal surveys. However, the drawbacks of such approaches include under-reporting, bias, erratic survey costs and loss to follow-up. Obtaining optimal samples of the target population at minimum cost under a double sampling design with auxiliary variables involves the use of extremely large data sets that may be analyzed computationally to reveal patterns, trends, associations and interactions. Further, in sampling there is a need to identify a small number of influential factors that explain the variable of interest, which is also known as the target variable or the study variable. Tang et al. [1] proposed a novel selection method which makes use of the properties in the frequency domain environme