A Bootstrapping Assessment on A U.S. Education Indicator Construction Through Multiple Imputation
Jianjun Wang¹

Accepted: 27 September 2020
© Springer Nature B.V. 2020
Abstract
Under a matrix sampling design, no student completes all test booklets in the National Assessment of Educational Progress (NAEP). To construct an education indicator of what students know and can do, multiple imputation (MI) is conducted to compute plausible values (PV) from student responses to a subset of the questions. Since 2013, NAEP has increased the number of imputed PV from five to 20. One purpose of this investigation is to examine the impact of this NAEP change on indicator reporting. An R algorithm is created to compute bootstrap standard errors of the PV distribution. The results show that the 20-imputation setting has reduced the standard error and improved normality in comparison to the five-imputation setting. While the bootstrap technique is typically set to generate 1000 resamples, the findings from this study further indicate that an increase in the number of resamples is unlikely to reduce the standard error estimate.

Keywords Bootstrapping · Plausible value · National Assessment of Educational Progress

What students know and can do in the United States directly impacts the outcome of global market competition (Carnoy and Rothstein 2013). School quality, as represented by student preparation in mathematics and science, is an important education indicator. As a result, major international studies, such as the Trends in International Mathematics and Science Study (TIMSS), are modeled after the large-scale data collections of the National Assessment of Educational Progress (NAEP) (see Johnson et al. 2003). In part, this is because "In the social sciences, particularly in survey research, precision implies the need for sufficient sample size (to account for sampling error)" (Pokropek 2011, p. 81). To represent the different school curricula across the nation, the NAEP instrument must include many items. Meanwhile, the test booklet for each student has to be "short enough not to exceed the student's patience for the low-stakes assessment" (Johnson et al. 2003, p. 13).
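The abstract describes resampling students to obtain a bootstrap standard error of the plausible-value distribution. The paper's algorithm is written in R; the following is only an illustrative Python sketch of the same idea, using simulated plausible values (the student count, scale mean, and variance components below are hypothetical, not NAEP data). It also illustrates the abstract's point that raising the resample count beyond 1000 stabilizes the estimate but does not shrink the standard error itself.

```python
import numpy as np

rng = np.random.default_rng(42)

# Hypothetical data: 500 students, each with 20 imputed plausible values.
# True proficiency varies between students; each PV adds imputation noise.
n_students, n_pv = 500, 20
theta = rng.normal(loc=250.0, scale=35.0, size=(n_students, 1))
pv = theta + rng.normal(loc=0.0, scale=10.0, size=(n_students, n_pv))

def bootstrap_se(pv, n_resamples=1000, rng=None):
    """Bootstrap standard error of the mean scale score.

    Students (rows) are resampled with replacement; each resample's
    statistic averages over all plausible values of the drawn students.
    """
    rng = rng or np.random.default_rng()
    n = pv.shape[0]
    means = np.empty(n_resamples)
    for b in range(n_resamples):
        idx = rng.integers(0, n, size=n)      # resample students
        means[b] = pv[idx].mean()             # mean over their PVs
    return means.std(ddof=1)

se_1000 = bootstrap_se(pv, 1000, np.random.default_rng(1))
se_5000 = bootstrap_se(pv, 5000, np.random.default_rng(1))
# More resamples stabilize the estimate; they do not reduce the SE.
print(round(se_1000, 3), round(se_5000, 3))
```

The two calls return nearly identical values because the standard error is a property of the sample, not of the number of bootstrap replications.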
* Jianjun Wang, [email protected]

¹ Department of Advanced Educational Studies, California State University, 9001 Stockdale Highway, Bakersfield, CA 93311, USA

Unlike a high school graduation exam that affects a student's future, the NAEP test has no influence on college admission or career choice. A lengthy test may cause low student effort in answering the questions and thus compromise the data quality for indicator reporting. To reduce the test burden, NAEP took an innovative approach in 1983 by combining questions into test booklets (Beaton 1987). Each student was given a subset of items in the test booklets, not all of which were in common with the subset of items received by another student. This procedure was named matrix sampling, and it systematically incorporates the missing responses on those items not covered in a student's test booklet (Kaplan and Lee 2018). For instance, the 2015 NAEP
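The booklet assembly described above can be sketched in miniature. The following Python example is purely illustrative and does not reproduce the actual NAEP design: it partitions a small hypothetical item pool into blocks, pairs the blocks into booklets so that every pair of blocks shares a booklet (a balanced-incomplete-block flavor of spiraling), and shows how the unassigned items become missing by design for each student.

```python
from itertools import combinations

# Illustrative matrix-sampling sketch (hypothetical pool, not NAEP's):
# 15 items split into 3 blocks of 5; each booklet holds 2 blocks.
item_pool = [f"item{i:02d}" for i in range(1, 16)]
blocks = [item_pool[i:i + 5] for i in range(0, 15, 5)]

# Every pair of blocks appears together in exactly one booklet.
booklets = [a + b for a, b in combinations(blocks, 2)]

# Spiral the booklets across students: each student sees 10 of the
# 15 items, so the remaining 5 are missing by design.
students = [f"s{j}" for j in range(6)]
assignment = {s: booklets[j % len(booklets)] for j, s in enumerate(students)}

for s, booklet in assignment.items():
    missing = [it for it in item_pool if it not in booklet]
    print(s, "answers", len(booklet), "items; missing by design:", len(missing))
```

Because the missingness is created by the design rather than by student behavior, it can be modeled explicitly, which is what licenses the multiple-imputation step that produces plausible values.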