Learning Multicriteria Utility Functions with Random Utility Models



G. Bous¹ and M. Pirlot²

¹ BIT Advanced Development, SAP, Sophia Antipolis, France
[email protected]
² Dept. of Mathematics & Operations Research, Faculté Polytechnique, Université de Mons, Belgium

Abstract. In traditional multicriteria decision analysis, decision maker evaluations or comparisons are considered to be error-free. In particular, algorithms like UTA*, ACUTA or UTA-GMS for learning utility functions to rank a set of alternatives assume that decision maker(s) are able to provide fully reliable training data in the form of, e.g., pairwise preferences. In this paper we relax this assumption by attaching a likelihood degree to each ordered pair in the training set; this likelihood degree can be interpreted as a choice probability (group decision making perspective) or, alternatively, as a degree of confidence about pairwise preferences (single decision maker perspective). Since binary choice probabilities reflect order relations, they can be used to train algorithms for learning utility functions. We specifically address the learning of piecewise linear additive utility functions through a logistic distribution; we conclude with examples and use-cases to illustrate the validity and relevance of our proposal.
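The logistic link between utility differences and binary choice probabilities mentioned in the abstract can be sketched as follows; this is an illustrative Bradley-Terry-style formulation, not the authors' exact parameterization, and the function and variable names are assumptions.

```python
import math

def choice_probability(u_a: float, u_b: float, beta: float = 1.0) -> float:
    """Logistic choice probability: P(a preferred to b) under a random
    utility model, as 1 / (1 + exp(-beta * (u(a) - u(b)))).
    beta controls how sharply utility gaps translate into near-certain choices."""
    return 1.0 / (1.0 + math.exp(-beta * (u_a - u_b)))

# Equal utilities yield indifference (probability 1/2); a positive utility
# gap yields a choice probability above 1/2.
print(choice_probability(0.5, 0.5))  # 0.5
print(choice_probability(0.9, 0.1))  # ~0.69
```

Read in the other direction, this is what allows observed choice probabilities (or confidence degrees) to serve as training data: they pin down utility differences up to the scale beta.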

1 Introduction

Preference learning consists in determining a model that reflects the subjective value, i.e. as perceived by a decision maker (DM), of alternatives or items belonging to a set S (the reader can refer to Fürnkranz & Hüllermeier, 2011, for a general introduction to the topic). In artificial intelligence and decision theory, this problem is frequently solved by learning a value or utility function u such that the order obtained by ranking the alternatives by decreasing utility corresponds to the order induced on S by the preferences of the DM. Typically, the preference relation on S is not entirely known; it is therefore common practice to obtain a sample of the preference relation on a subset SL ⊂ S – the learning set – to train the utility model. The utility function obtained in this way can then be used to evaluate the alternatives in S \ SL and to estimate the preference relation on S as a whole, thereby making it possible to solve ranking or choice problems on S.

In multicriteria decision theory, alternatives are characterized by their performances on several criteria; in this context, the preference learning problem aims at producing a utility function that evaluates items as a function of their ‘scores’ on the criteria and that reflects the preference relation of the DM. In other terms, the ‘global utility’ u(i) of an item i ∈ S is an aggregate of its scores on all criteria, and the order induced by u on S (or SL) is the same as the one induced by the preferences. The most common aggregation model in the decision-theoretic literature is the additive value model, which, starting with the contribution of Jacquet-Lagrèze & Siskos (1982), has led to a

P. Perny, M. Pirlot, and A. Tsoukiàs (Eds.): ADT 2013, LNAI 8176, pp. 101–115, 2013. © Springer-Verlag Berlin Heidelberg 2013
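The additive model described above, with piecewise linear marginal utilities as in the UTA family, can be sketched as follows; the breakpoint data, function names, and two-criterion example are illustrative assumptions, not taken from the paper.

```python
def marginal_utility(x, breakpoints):
    """Piecewise linear interpolation of (score, utility) breakpoints,
    clamped at the endpoints -- the marginal utility u_j of one criterion."""
    pts = sorted(breakpoints)
    if x <= pts[0][0]:
        return pts[0][1]
    for (x0, u0), (x1, u1) in zip(pts, pts[1:]):
        if x <= x1:
            return u0 + (u1 - u0) * (x - x0) / (x1 - x0)
    return pts[-1][1]

def global_utility(scores, criteria):
    """Additive value model: u(i) is the sum of the marginal utilities
    of item i's scores on all criteria."""
    return sum(marginal_utility(x, bp) for x, bp in zip(scores, criteria))

# Hypothetical example: two criteria whose marginal utilities are each
# scaled to [0, 0.5], so global utility lies in [0, 1].
criteria = [
    [(0.0, 0.0), (5.0, 0.4), (10.0, 0.5)],  # concave: diminishing returns
    [(0.0, 0.0), (10.0, 0.5)],              # linear
]
alternatives = {"a": (5.0, 2.0), "b": (2.0, 9.0), "c": (8.0, 8.0)}

# Ranking by decreasing global utility induces the order on S.
ranking = sorted(alternatives,
                 key=lambda i: global_utility(alternatives[i], criteria),
                 reverse=True)
print(ranking)  # ['c', 'b', 'a']
```

Learning such a model amounts to choosing the utility values at the breakpoints so that the induced ranking matches the preference information on the learning set.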