Robust Classification of Information Networks by Consistent Graph Learning

Graph regularization-based methods have achieved great success for network classification by making the label-link consistency assumption, i.e., if two nodes are linked together, they are likely to belong to the same class. However, in a real-world networ

  • PDF / 313,844 Bytes
  • 16 Pages / 439.37 x 666.142 pts Page_size
  • 16 Downloads / 167 Views

DOWNLOAD

REPORT


Department of Computer Science, University of Illinois at Urbana-Champaign, Champaign, IL, USA {shizhi2,hanj}@illinois.edu 2 Department of Systems and Information Engineering, University of Virginia, Charlottesville, VA, USA [email protected]

Abstract. Graph regularization-based methods have achieved great success for network classification by making the label-link consistency assumption, i.e., if two nodes are linked together, they are likely to belong to the same class. However, in a real-world network, there exist links that connect nodes of different classes. These inconsistent links raise a big challenge for graph regularization and deteriorate the classification performance significantly. To address this problem, we propose a novel algorithm, namely Consistent Graph Learning, which is robust to the inconsistent links of a network. In particular, given a network and a small number of labeled nodes, we aim at learning a consistent network with more consistent and fewer inconsistent links than the original network. Since the link information of a network is naturally represented by a set of relation matrices, the learning of a consistent network is reduced to learning consistent relation matrices under some constraints. More specifically, we achieve it by joint graph regularization on the nuclear norm minimization of consistent relation matrices together with 1 -norm minimization on the difference matrices between the original relation matrices and the learned consistent ones subject to certain constraints. Experiments on both homogeneous and heterogeneous network datasets show that the proposed method outperforms the state-of-the-art methods. Keywords: Robust classification · Information network link · Consistent network · Consistent Graph Learning

1

·

Consistent

Introduction

Information networks have been found to play increasingly important role in real-life applications. Generally speaking, information networks can be categorized into two families: (1) homogeneous information networks where there is only one type of nodes and links. Examples include friendship network in Facebook1 , co-author and citation network in DBLP2 , and the World Wide Web; 1 2

http://www.facebook.com http://www.informatik.uni-trier.de/∼ley/db/

c Springer International Publishing Switzerland 2015  A. Appice et al. (Eds.): ECML PKDD 2015, Part II, LNAI 9285, pp. 752–767, 2015. DOI: 10.1007/978-3-319-23525-7 46

Robust Classification of Information Networks by Consistent Graph Learning

753

and (2) heterogeneous information networks where there exist multiple types of nodes and links. A bibliographic information network is an example of heterogeneous information network, which contains four types of objects: papers, authors, conferences and terms. Papers and authors are linked by the relation of “written by” and “write”. Papers and conferences are linked by “published in” and “publish”. Papers and terms are linked by “contain” and “contained in”. In the past decade, many methods have been proposed for classification of both homogene