Computing Information Retrieval Performance Measures Efficiently in the Presence of Tied Scores

Abstract. The Information Retrieval community uses a variety of performance measures to evaluate the effectiveness of scoring functions. In this paper, we show how to adapt six popular measures — precision, recall, F1, average precision, reciprocal rank, and normalized discounted cumulative gain — to cope with scoring functions that are likely to assign many tied scores to the results of a search. Tied scores impose only a partial ordering on the results, meaning that there are multiple possible orderings of the result set, each one performing differently. One approach to cope with ties would be to average the performance values across all possible result orderings; but unfortunately, generating result permutations requires super-exponential time. The approach presented in this paper computes precisely the same performance value as the approach of averaging over all permutations, but does so as efficiently as the original, tie-oblivious measures.
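
To make the problem concrete, the following minimal Python sketch (illustrative only, not taken from the paper; all names and data are assumptions) computes the brute-force quantity described above for a toy query: it enumerates every result ordering consistent with the tied scores and averages precision at rank k over them. The number of such orderings is the product of the factorials of the tie-group sizes, which is why the paper replaces this enumeration with equally cheap, tie-aware closed forms.

```python
# Illustrative sketch only: brute-force averaging of precision@k over all
# orderings consistent with tied scores. The paper computes the same value
# without enumerating orderings; names and data here are made up.
from itertools import permutations, product


def orderings_consistent_with_scores(results):
    """Yield every total order that sorts by score descending,
    permuting documents only within groups of equal score."""
    groups = {}
    for doc, score in results:
        groups.setdefault(score, []).append(doc)
    ordered_groups = [groups[s] for s in sorted(groups, reverse=True)]
    # Pick one permutation per tie group and concatenate the groups.
    for choice in product(*(permutations(g) for g in ordered_groups)):
        yield [doc for group in choice for doc in group]


def precision_at_k(ordering, relevant, k):
    return sum(1 for doc in ordering[:k] if doc in relevant) / k


# Three documents share the top score, so 3! = 6 consistent orderings exist.
results = [("a", 2.0), ("b", 2.0), ("c", 2.0), ("d", 1.0)]
relevant = {"a", "d"}
values = [precision_at_k(o, relevant, 2)
          for o in orderings_consistent_with_scores(results)]
print(sum(values) / len(values))  # 0.333...: expected precision@2 under random tie-breaking
```

Even for this four-result query with a single tie group, six orderings must be generated; for realistic result sets with many large tie groups the enumeration is infeasible, whereas the tie-aware measures of the paper cost no more than the ordinary ones.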

1 Introduction

One of the fundamental problems in Information Retrieval is the ranking problem: ordering the results of a query such that the most relevant results show up first. Ranking algorithms employ scoring functions that assign a score to each result of a query, where the score is an estimate of the result’s relevance to the query at hand. So, ranking the results of a query consists of assigning a score to each result and then sorting the results by score, from highest to lowest. Ranking algorithms are typically evaluated against a test collection consisting of a set of queries. Each query in the test collection has a set of results, and the results have been arranged into a (partial or total) order by a human judge. In order to evaluate a ranking algorithm, the algorithm is applied to the result set of each query, the distance between the computed ranking and the “optimal” ordering determined by the judge is measured, and the distances are averaged over the entire test collection. Coming up with suitable distance metrics (or performance measures) has been the subject of considerable research, and there are numerous such metrics.
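
As a point of reference, here is a minimal sketch (not from the paper; the function names, the binary relevance judgments, and the test-collection layout are illustrative assumptions) of the conventional, tie-oblivious evaluation loop just described: score every result, sort by score from highest to lowest, compute a per-query measure such as average precision, and average it over the test collection. Note that the sort breaks ties arbitrarily, which is exactly the behaviour examined in the remainder of the paper.

```python
# Illustrative, tie-oblivious evaluation loop (not the paper's code).
# Assumes binary relevance judgments and that every relevant document
# appears among the candidates for its query.


def average_precision(ranked_docs, relevant):
    """Mean of precision@i taken at the rank i of each relevant document."""
    hits, precision_sum = 0, 0.0
    for i, doc in enumerate(ranked_docs, start=1):
        if doc in relevant:
            hits += 1
            precision_sum += hits / i
    return precision_sum / max(len(relevant), 1)


def mean_average_precision(test_collection, scoring_function):
    """Average the per-query measure over the whole test collection."""
    per_query = []
    for query, candidates, relevant in test_collection:
        # Rank by score, highest first. Python's sort is stable, so tied
        # documents keep their input order, i.e. an arbitrary tie-break.
        ranked = sorted(candidates,
                        key=lambda doc: scoring_function(query, doc),
                        reverse=True)
        per_query.append(average_precision(ranked, relevant))
    return sum(per_query) / len(per_query)
```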

Typically these performance measures assume that a ranking algorithm arranges the results of a query into a total ordering, i.e. that no two results of a query have the same score. This assumption is reasonable for scoring functions that map a rich set of features of the result document to a real-valued score, but it is less warranted for evaluating the performance of a single discrete feature, e.g. page in-degree, click count, and page visits. We stress that we are not concerned with evaluating the final results of a ranking system, which will almost certainly combine many features into a real-valued score, but rather with the internals of such a system, where the performance of individual, potentially discrete features needs to be assessed. A typical modern search engine will