A measurement error model approach to survey data integration: combining information from two surveys


Seho Park¹ · Jae Kwang Kim¹ · Diana Stukel²

Received: 10 July 2017 / Accepted: 11 September 2017
© Sapienza Università di Roma 2017

Abstract Combining information from several surveys of the same target population is an important practical problem in survey sampling. This paper is motivated by work that the authors undertook, sponsored by the Food and Nutrition Technical Assistance III Project (FANTA), with funding from the U.S. Agency for International Development (USAID) Bureau of Food Security (BFS). In the project, two surveys were conducted independently in some areas, and we present a measurement error model approach to integrating the mean estimates obtained from the two surveys. The predicted values for the counterfactual outcome are used to create composite estimates for the overlapping areas. An application of the technique to the project is provided.

Keywords Counterfactual outcome · Composite estimate · Variance estimation

Jae Kwang Kim, [email protected]

¹ Iowa State University, Ames, IA, USA
² FANTA III Project, FHI 360, Washington, DC, USA

1 Introduction

Survey integration is an emerging research area in statistics concerned with combining information from two or more independent surveys to obtain improved estimates of parameters of interest for the target population. One of the early applications of survey integration is the Consumer Expenditure Survey [20], where two survey vehicles (a Diary survey and a quarterly interview survey) were used to obtain improved estimates for the Diary survey items. Renssen and Nieuwenbroek [16], Merkouris [12,13], Wu [18] and Ybarra and Lohr [19] considered the problem of combining data from two independent surveys to estimate totals at the population and domain levels.

Combining information from two or more independent surveys is a problem frequently encountered in survey sampling. One of the classical setups used to combine information is two-phase sampling, where the measurement x is observed in both surveys and the study variable y is observed in only one survey, say Survey A; there is no measurement of y in Survey B. In this case, we can treat the union of the Survey A and Survey B samples as a phase-one sample and the Survey A sample as a phase-two sample. Hidiroglou [6] formulated this problem and developed efficient estimation using a two-phase regression estimation method. Fuller [4], Legg and Fuller [11], and Kim and Rao [9] treated it as a missing data problem and developed mass imputation to obtain improved estimates of the total as well as of domain totals.

Table 1 Data structure for combining two surveys with measurement errors

            x     y1    y2
Survey A    o     o
Survey B    o           o

Our setup differs from the two-phase sampling approach in that the two surveys provide different measurements of y. We consider a situation where two surveys have a common measurement for x but different measurements for