Structure learning for weighted networks based on Bayesian nonparametric models

  • PDF / 1,102,361 Bytes
  • 11 Pages / 595.276 x 790.866 pts Page_size
  • 71 Downloads / 197 Views

DOWNLOAD

REPORT


ORIGINAL ARTICLE

Structure learning for weighted networks based on Bayesian nonparametric models Xiaojuan Jiang1 • Wensheng Zhang1

Received: 10 February 2015 / Accepted: 30 September 2015  Springer-Verlag Berlin Heidelberg 2015

Abstract With the increase of availability and scope of complex networks, structure learning for networks has received an enormous amount of interest in many fields, including physics, computer and information sciences, biology and the social sciences. To extract compact and flexible representations for weighted networks, we propose a new Bayesian nonparametric model to learn from both the existence and weight of interactions between nodes. Our model adopts Dirichlet process prior to automatically infer the partition over nodes in weighted networks without specifying the number of clusters. This is vital for structure discovery in complex networks, especially for novel domains where we have little prior knowledge. We develop a mean-field variational algorithm to efficiently approximate the model’s posterior distribution over infinite latent clusters. Conducting extensive experiments on synthetic data set and four popular data sets, we demonstrate that our model can effectively capture the latent structure for complex weighted networks. Keywords Structure learning  Clustering  Probabilistic graph models  Bayesian nonparametric models  Variational inference

& Wensheng Zhang [email protected] Xiaojuan Jiang [email protected] 1

Institute of Automation, University of Chinese Academy of Sciences, Beijing 100190, People’s Republic of China

1 Introduction Statistical analysis of complex networks has been an active area of research for decades, and is becoming an increasingly important challenge in pattern recognition and machine learning [12, 30]. Consisting of pairwise measurements, such as existence or absence of links between pairs of objects, networks have been used to analyze interpersonal social relationships [30], communication networks [31], academic paper co-authorships and citations [33], protein interactions [25], gene regulatory patterns [17], and much more [12]. Unlike traditional attribute data collected from individual objects, the observations in networks are no longer independent or exchangeable because objects are pairwise related. Independence or exchangeability is a key assumption made in machine learning and statistics for traditional attribute data [3, 6]. This intrinsic difference in structure requires special treatments for network data. A central problem in the network literature is to uncover the latent structure based on the observed pairwise interactions between objects [11, 12]. Among all the statistical models proposed for this end, Stochastic Block Model (SBM) [15, 31, 41] is an elegant probabilistic graph model of block structure in unweighted networks. Probabilistic graph models [39] are perfect integration of probability theory and graph theory. They provide a natural tool for dealing with uncertainty that occurs throughout applied mathematics