When Raters Talk, Rubrics Fall Silent



Language Testing in Asia
Volume two, Issue four
October 2012

MASOUMEH AHMADI SHIRAZI
University of Tehran, Iran

Bio Data:
Masoomeh Ahmadi Shirazi received her Ph.D. in Applied Linguistics from the University of Tehran in December 2008 and her MA in TEFL from the same university in May 2003. She is now an assistant professor at the Faculty of Foreign Languages, University of Tehran. Her research interests include academic writing, writing assessment, second language acquisition, vocabulary learning, research methodology, and statistics.

Abstract
The research reported here suggests that raters involved in writing assessment rely more on their own criteria as the basis for their judgments than on the standards provided by scale descriptors. This study sampled the think-aloud protocols of eight raters who scored 15 essays according to the Test of Written English (TWE) holistic scoring guide. The verbal report data indicated that just under five percent of the statements made by the raters related to the issues assessed by the TWE. These findings push the utility of holistic rating scale descriptors into the background, foregrounding the raters' descriptor-independent judgments.

Keywords: raters, rubrics, descriptors, TWE, holistic scales

Introduction
This study examines how much of raters' judgments reflects scoring rubrics and descriptors; in other words, it focuses on the degree of raters' compliance with the scoring rubric. The role of rubrics and descriptors in writing assessment is important because they can contribute to higher reliability (Connor-Linton, 1995; DeRemer, 1998). The study collected raters' judgments on essays written by a number of participants, using Verbal Protocol Analysis (VPA) as the main research instrument. The raters' verbalizations were then transcribed so that the resulting texts could be analyzed both for the rubrics and descriptors proposed in the study and for any other writing features the raters introduced into the scoring procedure. We aim to show that, despite the crucial role scoring guides play in controlling raters' assessment behavior and thereby increasing reliability (Connor-Linton, 1995; Pollitt & Murray, 1996; DeRemer, 1998; Marby, 1999), they may be marginalized by the raters' own criteria. The flexibility of descriptors and rubrics, as Norton Pierce (1991) has suggested, leaves room for raters to reward writing features that are not part of the scoring guide, resulting in lower reliability.


This is especially true in holistic scoring, where the scoring guides raters work with give them latitude to include writing features in their assessment that the guide does not specify. Here we treat raters' idiosyncratic preferences as taking priority over holistic scoring guides, which play only a minor role in the writing assessment process.

Review of Related Literature

We can begin this section with the following question: "What