Maximum Likelihood Under Incomplete Information: Toward a Comparison of Criteria

Abstract Maximum likelihood is a standard approach to computing a probability distribution that best fits a given dataset. However, when datasets are incomplete or contain imprecise data, a major issue is to properly define, depending on the purpose, the likelihood function to be maximized. This paper compares several proposals in terms of their intuitive appeal, showing their anomalous behavior on examples.

1 Introduction

Edwards ([6], p. 9) defines a likelihood function as being proportional to the probability of obtaining the results given a hypothesis, according to a probability model. A fundamental axiom is that the probability of obtaining at least one of two results is the sum of the probabilities of obtaining each of them. In particular, a result in the sense of Edwards is not an arbitrary event: it is an elementary event, and only elementary events can be observed. For instance, when tossing a die and seeing the outcome, you cannot observe the event "odd"; you can only see 1, 3 or 5.

If this point of view is accepted, what becomes of the likelihood function under incomplete or imprecise observations? To answer this question properly, one must understand what a result is in this context. Namely, if we are interested in a certain random phenomenon, the observations we obtain do not directly inform us about the underlying random variables: because of interference with an imperfect measurement process, observations are set-valued. So, in order to properly exploit such incomplete information, we must first decide what to model (both options are sketched formally below):

1. the random phenomenon through its measurement process; or
2. the random phenomenon despite its measurement process.
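To fix intuitions, here is a minimal formal sketch of Edwards' notion of likelihood and of the two modeling options just listed; the notation is ours, not the paper's.

```latex
% Edwards' likelihood: for an elementary result x of a variable X
% governed by a parametric model P_theta,
\[
  L(\theta; x) \;\propto\; P_\theta(X = x),
  \qquad
  P_\theta(X \in \{x, x'\}) \;=\; P_\theta(X = x) + P_\theta(X = x').
\]
% Die example: "odd" is not a result but a disjunction of three
% results, with probability p_1 + p_3 + p_5.
%
% Option 1: model the measurement process as a random set Y with mass
% function m_theta over subsets; the imprecise observations
% A_1, ..., A_n are then genuine results, yielding a single
% likelihood function:
\[
  L_1(\theta) \;=\; \prod_{i=1}^{n} m_\theta(A_i).
\]
% Option 2: model the latent precise variable X despite its
% measurement; every precise dataset (x_1, ..., x_n) with x_i in A_i
% yields its own likelihood, so one faces a family of candidates:
\[
  \Lambda(\theta) \;=\; \Bigl\{\, \prod_{i=1}^{n} p_\theta(x_i)
  \;:\; x_i \in A_i,\ i = 1,\dots,n \Bigr\}.
\]
```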

In the first case, imprecise observations are themselves considered as results, and we can construct the likelihood function of a random set whose realizations contain the precise but ill-known realizations of the random variable of interest. Most authors, however, are interested in the other point of view and consider that the outcomes are the precise, although ill-observed, realizations of the random phenomenon. In this case, there are as many likelihood functions as there are precise datasets in agreement with the imprecise observations. Several ways of addressing this issue have been proposed. The most traditional approach is based on the EM algorithm; it comes down to constructing a fake sample of the ill-observed random variable in agreement with the imprecise data, and maximizing the likelihood with respect to this sample. In this paper we analyze this methodology in the light of the epistemic approach to statistical reasoning outlined in [1] and compare it with several recent proposals.
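To make the fake-sample idea concrete, here is a minimal sketch (our illustration, not taken from the paper) of EM applied to a die whose outcome is sometimes reported only as a set of faces; the data and iteration count are arbitrary.

```python
import numpy as np

# Illustrative sketch: EM for a die whose outcomes are sometimes
# reported imprecisely, as subsets of faces. Each observation is a
# set of face indices (0..5) containing the true face.
observations = [
    {0}, {2}, {4},           # precise reports: faces 1, 3, 5
    {0, 2, 4},               # imprecise report: "odd"
    {1, 3, 5}, {1, 3, 5},    # imprecise reports: "even"
]

theta = np.full(6, 1 / 6)    # initial face probabilities

for _ in range(100):
    # E-step: distribute each set-valued observation over its faces
    # in proportion to the current estimate -- these fractional
    # counts are the "fake sample".
    counts = np.zeros(6)
    for obs in observations:
        idx = sorted(obs)
        weights = theta[idx] / theta[idx].sum()
        counts[idx] += weights
    # M-step: maximize the complete-data likelihood, which for a die
    # amounts to renormalizing the counts.
    theta = counts / counts.sum()

print(np.round(theta, 3))
```

The E-step tacitly assumes the coarsening mechanism carries no information about the true face; how such assumptions affect the choice of the likelihood to maximize is precisely what is at stake in the comparison below.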