Validity Evidence in Science Achievement Assessments Found in a Sample of Published Research Articles on Science Teaching


Validity Evidence in Science Achievement Assessments Found in a Sample of Published Research Articles on Science Teaching

Gabriel M. Della-Piana¹ · Michael K. Gardner¹ · Zachary R. Mayne¹

© Springer Nature Switzerland AG 2020

Abstract

The authors add to published empirical findings that an appropriate range of validity evidence for achievement tests, including those in science and mathematics, has either not been gathered, not been reported, or is not accessible for independent review. The current study focuses on a sample of published, peer-reviewed science intervention studies from a single journal that used science achievement measures over an 11-year period, with some discussion of validity, and finds results similar to those of previous studies conducted in other STEM areas and broader contexts. The consensus of the educational measurement profession is that validity is assessed by the extent to which evidence and theory support the proposed interpretation of test scores for proposed uses. A shortfall in validity evidence is troublesome because it raises concerns about faulty or insubstantial test score interpretations being used to inform students' short-term and long-term education and career trajectories and to inform improvements to curricular interventions. The discussion cautions readers that although the study findings report shortfalls in some kinds of validity evidence, this simply raises a flag for the test user to consider what kinds of validity evidence apply to her test use. It also raises a flag for the profession to explore the reasons for the shortfalls.

Keywords Validity · Science assessment · Science achievement testing

Journal for STEM Education Research

Electronic supplementary material: The online version of this article (https://doi.org/10.1007/s41979-02000029-9) contains supplementary material, which is available to authorized users.

* Michael K. Gardner [email protected]

¹ Department of Educational Psychology, University of Utah, Salt Lake City, UT, USA

Valid interpretation of the results of assessment of science education interventions depends not only on the validity of the study design, but also on the validity of the student achievement outcome measures. The professional consensus document, the Standards for Educational and Psychological Testing (American Educational Research Association, American Psychological Association, National Council on Measurement in Education 2014; hereafter, Standards), notes that "Validity is … the most fundamental consideration in developing tests and evaluating tests" (p. 11). The reason for this judgment is that validity is "the degree to which evidence and theory support the interpretations of test scores for proposed uses of tests" (p. 11). Thus, shortfalls in gathering appropriate validity evidence for tests may result in achievement tests of unknown quality, or in faulty or insubstantial test score interpretation and use, as will be demonstrated in this paper. For example, if even a slight paraphrasing of a test question leads to different results by the test taker, one is left with an un