Co-training with Credal Models


Abstract. So-called credal classifiers offer an interesting approach when the reliability or robustness of predictions has to be guaranteed. Through the use of convex probability sets, they can select multiple classes as their prediction when information is insufficient and predict a unique class only when the available information is rich enough. The goal of this paper is to explore whether this particular feature can be used advantageously in the setting of co-training, in which one classifier strengthens another by feeding it new labeled data. We propose several co-training strategies to exploit the potential indeterminacy of credal classifiers and test them on several UCI datasets. We then compare the best strategy to the standard co-training process to check its efficiency.

Keywords: Co-training · Imprecise probabilities · Semi-supervised learning · Ensemble models

1 Introduction

There are many application fields (gesture, human activity, finance, ...) where extracting large amounts of unlabeled data is easy, but where labeling them reliably requires costly human effort or expertise that may be rare and expensive. In such cases, obtaining a large labeled dataset is not possible, which makes it difficult to train an efficient classifier from labeled data alone. The general goal of semi-supervised learning techniques [1,7,28] is to solve this issue by exploiting the information contained in unlabeled data. These techniques include different approaches such as the adaptation of training criteria [13,14,16], active learning methods [18] and co-training-like approaches [6,19,22]. In this paper, we focus on the co-training framework. This approach trains two classifiers in parallel, and each model then attempts to strengthen the other by labeling a selection of unlabeled data. We will call trainer the classifier providing new labeled instances and learner the classifier using them as new training data. In the standard co-training approach [6,22], the trainer provides the learner with the data for which its predicted labels are the most confident. However, although those labels are predicted with high confidence by the trainer, there is no guarantee that the newly labeled instances will be informative for the learner, in the sense that they may not help it improve its accuracy.

© Springer International Publishing AG 2016. F. Schwenker et al. (Eds.): ANNPR 2016, LNAI 9896, pp. 92–104, 2016. DOI: 10.1007/978-3-319-46182-3_8


To solve this issue, we propose a new co-training approach using credal classifiers. Such classifiers, through the use of convex sets of probabilities, can predict a set of labels when the training data are insufficiently conclusive. This means that they produce a single label as prediction only when the available information is sufficient (i.e., when the probability set is small enough). The basic idea of our approach is to select, as potential new training data for the learner, those instances for which the (credal) trainer has predicted a single label and the learner multiple ones.
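The selection rule just described can be illustrated with a minimal sketch. It assumes that the set-valued predictions of both classifiers are already available as Python sets of labels; how a credal classifier derives these sets from its probability set is not covered here. The function name select_for_learner and the toy data are hypothetical and serve only as illustration, not as the strategies evaluated in the paper.

```python
from typing import Hashable, List, Sequence, Set, Tuple

Label = Hashable


def select_for_learner(
    trainer_preds: Sequence[Set[Label]],
    learner_preds: Sequence[Set[Label]],
) -> List[Tuple[int, Label]]:
    """Return (index, label) pairs for unlabeled instances on which the
    trainer is determinate (one label) and the learner is not (several).

    Assumption: predictions are given as sets of labels, one per instance.
    """
    if len(trainer_preds) != len(learner_preds):
        raise ValueError("both classifiers must predict the same instances")

    selected = []
    for i, (t_set, l_set) in enumerate(zip(trainer_preds, learner_preds)):
        # Keep the instance only if the trainer commits to a single label
        # while the learner still hesitates between several labels.
        if len(t_set) == 1 and len(l_set) > 1:
            (label,) = t_set  # unpack the unique predicted label
            selected.append((i, label))
    return selected


# Toy predictions on four unlabeled instances (illustrative data only).
trainer_preds = [{"a"}, {"a", "b"}, {"b"}, {"c"}]
learner_preds = [{"a", "b"}, {"a", "b"}, {"b"}, {"a", "b", "c"}]

candidates = select_for_learner(trainer_preds, learner_preds)
print(candidates)  # [(0, 'a'), (3, 'c')]
```

Instances 0 and 3 are retained because the trainer is determinate on them while the learner is not. Instance 2 is discarded since the learner already predicts a single label there, so the trainer's label is assumed to bring it little new information.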

2 Co-training Framework

We assume that sam