Multi-view Semantic Learning for Data Representation

1 College of Information and Technology, Northwest University of China, Xi'an, China
[email protected], {pjy,ziyuguan}@nwu.edu.cn
2 Department of Computer Science, University of North Carolina at Charlotte, Charlotte, NC, USA
[email protected]

Abstract. Many real-world datasets are represented by multiple features or modalities which often provide compatible and complementary information to each other. In order to obtain a good data representation that synthesizes multiple features, researchers have proposed different multi-view subspace learning algorithms. Although label information has been exploited for guiding multi-view subspace learning, previous approaches either fail to directly capture the semantic relations between labeled items or unrealistically make a Gaussian assumption about the data distribution. In this paper, we propose a new multi-view nonnegative subspace learning algorithm called Multi-view Semantic Learning (MvSL). MvSL tries to capture the semantic structure of multi-view data by a novel graph embedding framework. The key idea is to let neighboring intra-class items be near each other while keeping nearest inter-class items away from each other in the learned common subspace across multiple views. This nonparametric scheme can better model non-Gaussian data. To assess nearest neighbors in the multi-view context, we develop a multiple kernel learning method for obtaining an optimal kernel combination from multiple features. In addition, we encourage each latent dimension to be associated with a subset of views via sparseness constraints. In this way, MvSL is able to capture flexible conceptual patterns hidden in multi-view features. Experiments on two real-world datasets demonstrate the effectiveness of the proposed algorithm.

Keywords: Multi-view learning · Nonnegative matrix factorization · Graph embedding · Multiple kernel learning · Structured sparsity
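Two ingredients of the abstract can be made concrete with a small sketch: (a) combining per-view kernels into a single similarity, and (b) building label-aware intra-class and inter-class nearest-neighbor graphs from it. The snippet below is an illustrative simplification, not the authors' actual MvSL algorithm: the function names, the fixed (rather than learned) kernel weights, and the simple 0/1 edge weights are our assumptions.

```python
import numpy as np

def combined_kernel(kernels, mu):
    """Convex combination of per-view kernel matrices: K = sum_v mu_v * K_v."""
    mu = np.asarray(mu, dtype=float)
    mu = mu / mu.sum()                              # enforce sum_v mu_v = 1
    return sum(m * K for m, K in zip(mu, kernels))

def class_graphs(K, labels, k=3):
    """Build 0/1 intra-class and inter-class k-NN affinity matrices from a
    kernel (similarity) matrix K and item labels."""
    n = K.shape[0]
    W_intra = np.zeros((n, n))                      # neighbors to pull together
    W_inter = np.zeros((n, n))                      # neighbors to push apart
    for i in range(n):
        same = [j for j in range(n) if j != i and labels[j] == labels[i]]
        diff = [j for j in range(n) if labels[j] != labels[i]]
        for j in sorted(same, key=lambda j: -K[i, j])[:k]:
            W_intra[i, j] = W_intra[j, i] = 1.0
        for j in sorted(diff, key=lambda j: -K[i, j])[:k]:
            W_inter[i, j] = W_inter[j, i] = 1.0
    return W_intra, W_inter
```

A graph-embedding objective would then reward latent codes that are close along `W_intra` edges and far apart along `W_inter` edges; in MvSL the kernel weights themselves are additionally optimized by multiple kernel learning rather than fixed as here.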

1 Introduction

In many real-world data analytic problems, instances are often described with multiple modalities or views. It becomes natural to integrate multi-view representations to obtain better performance than relying on a single view. A good integration of multi-view features can lead to a more comprehensive description of the data items, which could improve the performance of many related applications. An emerging area of multi-view learning is multi-view latent subspace learning, which aims to obtain a compact latent representation by taking advantage of the inherent structure and relations across multiple views. A pioneering technique in this area is Canonical Correlation Analysis (CCA) [7], which tries to learn projections of two views so that the correlation between them is maximized. Recently, many methods have been applied to multi-view subspace learning, such as matrix factorization [9], [4], [11], [18], graphical models [3] and spectral embedding.

P. Luo et al., in: A. Appice et al. (Eds.): ECML PKDD 2015, Part I, LNAI 9284, pp. 367–382, 2015. © Springer International Publishing Switzerland 2015. DOI: 10.1007/978-3-319-23528-8_23
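The CCA objective mentioned above can be sketched numerically. The following is a minimal from-scratch NumPy implementation (an illustrative sketch, not the formulation of [7] or of this paper): it recovers the first pair of canonical directions for two views via the SVD of the whitened cross-covariance matrix; the function name and the small ridge term `reg` are our assumptions.

```python
import numpy as np

def cca_first_pair(X, Y, reg=1e-8):
    """First canonical correlation and projection directions for two views.

    Computes rho = max corr(X @ wx, Y @ wy) via the SVD of
    Cxx^{-1/2} Cxy Cyy^{-1/2}, with a small ridge for stability."""
    Xc, Yc = X - X.mean(0), Y - Y.mean(0)
    n = X.shape[0]
    Cxx = Xc.T @ Xc / n + reg * np.eye(X.shape[1])
    Cyy = Yc.T @ Yc / n + reg * np.eye(Y.shape[1])
    Cxy = Xc.T @ Yc / n

    def inv_sqrt(C):
        # inverse square root of a symmetric PD matrix via eigendecomposition
        w, V = np.linalg.eigh(C)
        return V @ np.diag(1.0 / np.sqrt(w)) @ V.T

    M = inv_sqrt(Cxx) @ Cxy @ inv_sqrt(Cyy)
    U, s, Vt = np.linalg.svd(M)
    wx = inv_sqrt(Cxx) @ U[:, 0]       # first canonical direction for view X
    wy = inv_sqrt(Cyy) @ Vt[0]         # first canonical direction for view Y
    return s[0], wx, wy

# Two synthetic views driven by a shared latent signal
rng = np.random.default_rng(0)
z = rng.normal(size=(300, 1))
X = z + 0.1 * rng.normal(size=(300, 3))
Y = z + 0.1 * rng.normal(size=(300, 2))
rho, wx, wy = cca_first_pair(X, Y)     # rho close to 1 for strongly shared views
```

Because both views are noisy copies of the same latent factor, the learned projections align almost perfectly, which is exactly the behavior CCA exploits for two-view subspace learning.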