Joint Semi-supervised Similarity Learning for Linear Classification
Université Jean Monnet, Laboratoire Hubert Curien, Saint-Étienne, France
{Amaury.Habrard,Marc.Sebban}@univ-st-etienne.fr
Université Grenoble Alpes, CNRS-LIG/AMA, Saint-Martin-d'Hères, France
{Irina.Nicolae,Eric.Gaussier}@imag.fr
Abstract. The importance of metrics in machine learning has attracted growing interest in distance and similarity learning. We study this problem in the situation where few labeled data (and potentially few unlabeled data as well) are available, a situation that arises in several practical contexts. We also provide a complete theoretical analysis of the proposed approach. It is indeed worth noting that the metric learning research field lacks theoretical guarantees on the generalization capacity of the classifier associated with a learned metric. The theoretical framework of (ε, γ, τ)-good similarity functions [1] has been one of the first attempts to draw a link between the properties of a similarity function and those of a linear classifier making use of it. In this paper, we extend this theory to a method where the metric and the separator are jointly learned in a semi-supervised way, a setting that has not been explored before, and provide a theoretical analysis of this joint learning via Rademacher complexity. Experiments performed on standard datasets show the benefits of our approach over state-of-the-art methods.

Keywords: Similarity learning · (ε, γ, τ)-good similarity · Rademacher complexity
1 Introduction
Many researchers have used the underlying geometry of the data to improve classification algorithms, e.g. by learning Mahalanobis distances instead of the standard Euclidean distance, thus paving the way for a new research area termed metric learning [5,6]. While most of these studies have based their approaches on distance learning [3,9,10,22,24], similarity learning has also attracted a growing interest [2,12,16,20], the rationale being that the cosine similarity should in some cases be preferred over the Euclidean distance. More recently, [1] have proposed a complete framework relating similarities to a classification algorithm making use of them. This general framework, which can be applied to any bounded similarity function (potentially derived from a distance), provides generalization guarantees on a linear classifier learned from the similarity. Their algorithm does not enforce positive definiteness of the similarity, unlike most state-of-the-art methods. However, to enjoy such generalization guarantees, the similarity function is assumed to be known beforehand and to satisfy (ε, γ, τ)-goodness properties. Unfortunately, [1] do not provide any algorithm for learning such similarities. In order to overcome these limitations, [4] have explored the possibility of independently learning an (ε, γ, τ)-good similarity function.
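For reference, the notion at the heart of this framework can be recalled as follows (a paraphrase of the definition in [1]; the notation is ours): a similarity function K : X × X → [−1, 1] is (ε, γ, τ)-good for a distribution P over labeled examples if there exists a (possibly random) indicator R of "reasonable" points such that

\[
\Pr_{(x,y)\sim P}\Big[\, y \,\mathbb{E}_{(x',y')\sim P}\big[\, y' K(x,x') \,\big|\, R(x') \,\big] < \gamma \Big] \le \epsilon,
\qquad
\Pr_{x'\sim P}\big[ R(x') \big] \ge \tau .
\]

In words: all but an ε fraction of examples are, on average, more similar to reasonable examples of their own class than to reasonable examples of the other class by a margin γ, and reasonable points have probability mass at least τ.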
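To make the link with linear classification concrete, the following is a minimal sketch (ours, not the algorithm of [1] or of this paper) of the landmark construction that underlies this framework: each point is mapped to the vector of its similarities to a set of landmark points, and a sparse linear separator is learned on this representation. The helper names (cosine_sim, fit_landmark_classifier) are hypothetical, and scikit-learn's L1-penalized hinge loss stands in for the L1-constrained linear program of [1].

import numpy as np
from sklearn.linear_model import SGDClassifier

def cosine_sim(A, B):
    # Bounded similarity K(x, x') in [-1, 1] (cosine); no PSD constraint needed.
    A = A / np.linalg.norm(A, axis=1, keepdims=True)
    B = B / np.linalg.norm(B, axis=1, keepdims=True)
    return A @ B.T

def fit_landmark_classifier(X, y, landmarks, sim=cosine_sim):
    # Map x -> phi(x) = (K(x, l_1), ..., K(x, l_d)), then learn a sparse
    # linear separator sign(sum_i alpha_i K(x, l_i)) on that space.
    Phi = sim(X, landmarks)  # n x d similarity representation
    clf = SGDClassifier(loss="hinge", penalty="l1", alpha=1e-3,
                        max_iter=2000, random_state=0)
    clf.fit(Phi, y)
    return clf

# Toy usage: labels given by the sign of the first coordinate.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
y = np.sign(X[:, 0])
landmarks = X[rng.choice(len(X), size=20, replace=False)]  # random landmarks
clf = fit_landmark_classifier(X, y, landmarks)
print(clf.predict(cosine_sim(X[:5], landmarks)))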