Graph based semi-supervised learning via label fitting

  • PDF / 635,413 Bytes
  • 13 Pages / 595.276 x 790.866 pts Page_size
  • 104 Downloads / 174 Views

DOWNLOAD

REPORT


ORIGINAL ARTICLE

Graph based semi-supervised learning via label fitting Weiya Ren1 • Guohui Li1

Received: 8 April 2015 / Accepted: 29 October 2015  Springer-Verlag Berlin Heidelberg 2015

Abstract The global smoothness and the local label fitting are two key issues for estimating the function on the graph in graph based semi-supervised learning (GSSL). The unsupervised normalized cut method can provide a more reasonable criterion for learning the global smoothness of the data than classic GSSL methods. However, the semi-supervised norm of the normalized cut, which is a NP-hard problem, has not been studied well. In this paper, a new GSSL framework is proposed by extending normalized cut to its semi-supervised norm. The NP-hard semi-supervised normalized cut problem is innovatively solved by effective algorithms. In addition, we can design more reasonable local label fitting terms than conventional GSSL methods. Other graph cut methods are also investigated to extend the proposed semi-supervised learning algorithms. Furthermore, we incorporate the nonnegative matrix factorization with the proposed learning algorithms to solve the out-of-sample problem in semi-supervised learning. Solutions obtained by the proposed algorithms are sparse, nonnegative and congruent with unit matrix. Experiment results on several real benchmark datasets indicate that the proposed algorithms achieve good results compared with state-of-art methods.

& Weiya Ren [email protected] Guohui Li [email protected] 1

College of Information System and Management, National University of Defense Technology, Changsha 410072, People’s Republic of China

Keywords Graph based semi-supervised learning  Global smoothness  Label fitting  Graph cut  Basis matrix  Congruency approximation

1 Introduction In the past several years, the semi-supervised learning (SSL) approach, which combines limited labeled samples with rich unlabeled samples to improve learning ability, has attracted lots of attention [1–6]. As an important branch of SSL, graph based semi-supervised learning (GSSL) [7– 12] has recently become popular in wide applications due to their high accuracy and computational efficiency. Its application areas include image annotation [13, 14], collective image parsing [15] and medical diagnosis [16]. Some researches focus on the graph construction [17–19] in GSSL, while others focus on the propagation strategy, such as Gaussian fields and harmonic functions (GFHF) [9], local and global consistency (LGC) [7], greedy gradient max-cut (GGMC) [8], and manifold regularization [20]. Specifically, LGC and GFHF treat the soft label matrix as the only variable in optimization, while GGMC solves a bivariate optimization problem over the predicted soft labels and the initial hard labels. GSSL is also one kind of graph-based learning (GL), which treats samples from a data set as vertices in a graph and builds pairwise weights between these vertices. Global smoothness and local label fitting are two key issues in GSSL [7, 20, 21]. For global smoothness