Pairwise-similarity-based instance reduction for efficient instance selection in multiple-instance learning


ORIGINAL ARTICLE

Liming Yuan • Jiafeng Liu • Xianglong Tang • Daming Shi • Lu Zhao



Received: 18 October 2013 / Accepted: 6 March 2014
© Springer-Verlag Berlin Heidelberg 2014

Abstract Unlike traditional supervised learning, multiple-instance learning (MIL) deals with learning from bags of instances rather than individual instances. Over the last couple of years, some researchers have attempted to solve the MIL problem from the perspective of instance selection. The basic idea is to select some instance prototypes from the training bags and then convert MIL to single-instance learning using these prototypes. However, a bag is composed of one or more instances, which often leads to high computational complexity for instance selection. In this paper, we propose a simple and general instance reduction method to speed up the instance selection process for various instance selection-based MIL (ISMIL) algorithms. We call it pairwise-similarity-based instance reduction for multiple-instance learning (MIPSIR); it is based on the pairwise similarity between instances in a bag. Instead of the original training bag, we use for instance selection the pair of instances within that bag that has the highest or lowest similarity value, depending on the bag label. We have applied our method to four effective ISMIL algorithms. The evaluation on three benchmark datasets demonstrates that the MIPSIR method can significantly improve the efficiency of an ISMIL algorithm while maintaining or even improving its generalization capability.
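The reduction step described in the abstract can be sketched as follows. This is only an illustration under stated assumptions, not the authors' implementation: the abstract does not say which similarity measure is used or which bag label is paired with the highest versus the lowest similarity, so the Gaussian similarity, the `gamma` parameter, the function name `reduce_bag`, and the `positive_rule` switch are all hypothetical choices made here.

```python
import numpy as np


def reduce_bag(bag, label, positive_rule="max", gamma=1.0):
    """Reduce a bag to one pair of instances chosen by pairwise similarity.

    bag           : array-like of shape (n_instances, n_features)
    label         : 1 for a positive bag, 0 for a negative bag
    positive_rule : ASSUMPTION. "max" means positive bags keep the most
                    similar pair and negative bags the least similar pair;
                    "min" swaps the two. The abstract does not state which
                    mapping the method uses.
    gamma         : ASSUMPTION. Width of a Gaussian similarity
                    exp(-gamma * ||x - y||^2); the abstract does not name
                    the similarity measure.
    """
    if positive_rule not in ("max", "min"):
        raise ValueError("positive_rule must be 'max' or 'min'")

    bag = np.asarray(bag, dtype=float)
    n = bag.shape[0]
    if n < 2:
        # A bag with a single instance has no pair to select from.
        return bag.copy()

    # Squared Euclidean distance between every pair of instances.
    diff = bag[:, None, :] - bag[None, :, :]
    sq_dist = np.sum(diff ** 2, axis=-1)
    sim = np.exp(-gamma * sq_dist)

    # Consider each unordered pair (i < j) exactly once.
    rows, cols = np.triu_indices(n, k=1)
    pair_sims = sim[rows, cols]

    # Pick the highest or the lowest similarity depending on the bag label.
    use_max = (positive_rule == "max") == (label == 1)
    k = np.argmax(pair_sims) if use_max else np.argmin(pair_sims)

    return bag[[rows[k], cols[k]]]
```

Applying `reduce_bag` to every training bag before running an instance selection routine leaves at most two instances per bag for that routine to examine, which is where the speed-up described above would come from. Computing the pairwise similarities costs time quadratic in the bag size, but it is done once per bag.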

L. Yuan (✉) • J. Liu • X. Tang • D. Shi
School of Computer Science and Technology, Harbin Institute of Technology, Harbin 150001, China
e-mail: [email protected]

L. Zhao
College of Computer Science and Technology, Harbin Engineering University, Harbin 150001, China
e-mail: [email protected]

Keywords Multiple-instance learning • Instance selection • Instance reduction • Similarity • Support vector machines

1 Introduction

In many practical applications one cannot always obtain the labels of all training examples, which is one of the obstacles to using the supervised learning paradigm. Multiple-instance learning (MIL) provides an alternative solution to this problem. Unlike traditional supervised learning, MIL deals with classifying bags of instances rather than individual instances. Instead of labeled instances, a MIL learner receives labeled bags as training examples. A training bag is considered positive if at least one of its instances is positive, and negative if none of its instances is positive. Since the MIL model was introduced by Dietterich et al. [1], it has received a great deal of attention from the machine learning community. To date, its applications cover a wide variety of real-world problems such as drug activity prediction [1], stock selection [2], natural scene classification [3], computer-aided diagnosis [4, 5], content-based image retrieval [6–9], action recogn