The Good, the Bad, the Difficult, and the Easy: Something Wrong with Information Retrieval Evaluation?
Abstract. TREC-like evaluations do not consider topic ease and difficulty. However, it seems reasonable to reward good effectiveness on difficult topics more than good effectiveness on easy topics, and to penalize bad effectiveness on easy topics more than bad effectiveness on difficult topics. This paper shows how this approach leads to evaluation results that could be more reasonable, and that are different to some extent. I provide a general analysis of this issue, propose a novel framework, and experimentally validate a part of it.

Keywords: Evaluation, TREC, topic ease and difficulty.
1 Introduction
As lecturers, when we try to assess a student's performance during an exam, we distinguish between easy and difficult questions. When we ask easy questions to our students we expect correct answers; therefore, we give a rather mild positive evaluation if the answer to an easy question is correct, and we give a rather strong negative evaluation if the answer is wrong. Conversely, when we ask difficult questions, we are quite keen to presume a wrong answer; therefore, we give a rather mild negative evaluation if the answer to a difficult question is wrong, and we give a rather strong positive evaluation if the answer is correct. The difficulty of a question can be determined a priori (on the basis of the lecturer's knowledge of what has been taught to the students, and how) or a posteriori (e.g., by averaging, in a written exam, the answer evaluations of all the students to the same question). Probably, a mixed approach (both a priori and a posteriori) is the most common choice. During oral examinations, when we have an idea of the student's preparation (e.g., because of a previous written exam, or a term project, or after having asked the first questions), we even do something more: we ask difficult questions to good students, and we ask easy questions to bad students. This sounds quite obvious too: what's the point in asking easy questions to good students? They will almost

C. Macdonald et al. (Eds.): ECIR 2008, LNCS 4956, pp. 642–646, 2008. © Springer-Verlag Berlin Heidelberg 2008
certainly answer correctly, as expected, without providing much information about their preparation. And what's the point in asking difficult questions to bad students? They will almost certainly answer wrongly, without providing much information — and incidentally increasing the examiner's stress level. Therefore we can state the following principles, as "procedures" to be followed during student assessment:

Easy and Difficult Principle. Weight more (less) both (i) errors on easy (difficult) questions and (ii) correct answers on difficult (easy) questions.

Good and Bad Principle. On the basis of an estimate of the student's preparation, ask (i) difficult questions to good students and (ii) easy questions to bad students.

I am not aware of any lecturer/teacher/examiner who would not agree with the two principles, and who would not behave accordingly, once enlightened by them. In Information Retrieval (IR) evaluation we a
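The Easy and Difficult Principle can be made concrete with a small sketch. Here topic difficulty is estimated a posteriori, as in the written-exam analogy above: one minus the mean effectiveness of all systems on that topic. Each system's per-topic score is then weighted so that success on difficult topics, and failure on easy ones, counts more. The function names and the specific linear weighting scheme are illustrative assumptions, not the framework proposed in this paper.

```python
def topic_difficulty(runs):
    """A posteriori difficulty of each topic: 1 - mean effectiveness
    across all systems (runs) on that topic. Scores are in [0, 1]."""
    n_systems = len(runs)
    n_topics = len(runs[0])
    return [
        1 - sum(run[t] for run in runs) / n_systems
        for t in range(n_topics)
    ]

def weighted_effectiveness(run, difficulty):
    """Difficulty-weighted score of one system (illustrative scheme):
    reward success proportionally to topic difficulty, and penalize
    failure proportionally to topic ease."""
    total = 0.0
    for score, d in zip(run, difficulty):
        total += score * d            # correct "answers" on hard topics count more
        total -= (1 - score) * (1 - d)  # "errors" on easy topics cost more
    return total / len(run)

# Hypothetical per-topic effectiveness (e.g., average precision) of two systems.
runs = [
    [0.9, 0.2, 0.8],  # system A
    [0.7, 0.1, 0.9],  # system B
]
difficulty = topic_difficulty(runs)
score_a = weighted_effectiveness(runs[0], difficulty)
```

With these numbers the second topic comes out as the hardest (difficulty 0.85), so system A's 0.2 there hurts it only mildly, while its 0.8 on the easiest topic earns little credit.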