Methods of Test Validation


INTRODUCTION

Test validation methods are at the heart of language testing research. Validity is a theoretical notion that defines the scope and the nature of validation work, whereas validation is the process of developing and evaluating evidence for a proposed score interpretation and use. The way validity is conceptualized determines the scope and the nature of validity investigations and hence the methods to gather evidence. Validation frameworks specify the process used to prioritize, integrate, and evaluate evidence collected using various methods. Therefore, this review delineates the evolution of validity theory and validation frameworks, and synthesizes the methodologies used to validate language tests.

In general, developments of validity theories and validation frameworks in language testing have paralleled advances in educational measurement (Cronbach and Meehl, 1955; Cureton, 1951; Kane, 1992; Messick, 1989). Validation methods have been influenced by three areas in particular. Developments in psychometric and statistical methods in education have featured prominently in language testing research (Bachman, 2004; Bachman and Eignor, 1997). Qualitative methods in language testing (Banerjee and Luoma, 1997) have been well informed by second language acquisition (Bachman and Cohen, 1998), conversation analysis, and discourse analysis (Lazaraton, 2002). Research in cognitive psychology has also found its way into core language testing research, especially that regarding introspective methodologies (Green, 1997) and the influence of cognitive demands of tasks on task complexity and difficulty (Iwashita, McNamara, and Elder, 2001).

EARLIER DEVELOPMENTS

Xiaoming Xi
E. Shohamy and N. H. Hornberger (eds), Encyclopedia of Language and Education, 2nd Edition, Volume 7: Language Testing and Assessment, 177–196. © 2008 Springer Science+Business Media LLC.

The validation of the discrete-point language tests popular in the 1950s and 1960s, including language aptitude tests, was mostly couched in the validity conceptualization by Lado (1961). Taking up the term of criterion-related validity from educational measurement (Cureton, 1951), Lado argued that the validity of a language test can be established indirectly if scores on the test are reasonably correlated with those of another test or criterion which is valid. When addressing item validity, Lado discussed the content and performance evaluation of multiple-choice items in particular. According to Lado, content validity concerns the degree to which an item contains a language problem that is representative of the problem in real life. The correlation between the performance on an item and on the same problem in the criterion measure constitutes criterion-related validity evidence. Seeing reliability as a prerequisite for validity, Lado introduced the concepts of test–retest reliability and internal consistency of test items. The 1970s witnessed a trend toward more direct and communicative language tests, yet the focus still centered s