Motif-based spectral clustering of weighted directed networks

  • PDF / 37,728,960 Bytes
  • 41 Pages / 595 x 794 pts Page_size
  • 71 Downloads / 179 Views

DOWNLOAD

REPORT


(2020) 5:62

Applied Network Science

RESEARCH

Open Access

Motif-based spectral clustering of weighted directed networks William G. Underwood1,2* *Correspondence: [email protected]; [email protected] 1 Department of Statistics, University of Oxford, 24–29 St. Giles, Oxford, OX1 3LB, UK 3 The Alan Turing Institute, British Library, 96 Euston Road, London NW1 2DB, UK Full list of author information is available at the end of the article

, Andrew Elliott3,4 and Mihai Cucuringu1,3*

Abstract Clustering is an essential technique for network analysis, with applications in a diverse range of fields. Although spectral clustering is a popular and effective method, it fails to consider higher-order structure and can perform poorly on directed networks. One approach is to capture and cluster higher-order structures using motif adjacency matrices. However, current formulations fail to take edge weights into account, and thus are somewhat limited when weight is a key component of the network under study. We address these shortcomings by exploring motif-based weighted spectral clustering methods. We present new and computationally useful matrix formulae for motif adjacency matrices on weighted networks, which can be used to construct efficient algorithms for any anchored or non-anchored motif on three nodes. In a very sparse regime, our proposed method can handle graphs with a million nodes and tens of millions of edges. We further use our framework to construct a motif-based approach for clustering bipartite networks. We provide comprehensive experimental results, demonstrating (i) the scalability of our approach, (ii) advantages of higher-order clustering on synthetic examples, and (iii) the effectiveness of our techniques on a variety of real world data sets; and compare against several techniques from the literature. We conclude that motif-based spectral clustering is a valuable tool for analysis of directed and bipartite weighted networks, which is also scalable and easy to implement. Keywords: Motif, Spectral clustering, Weighted network, Directed network, Community detection, Graph Laplacian, Bipartite network

1 Introduction Networks are ubiquitous in modern society; from the internet and online blogs to protein interactions and human migration, we are surrounded by inherently connected structures (Kolaczyk and Csárdi 2014). The mathematical and statistical analysis of networks is therefore an important area of modern research, with applications in a diverse range of fields, including biology (Albert 2005), chemistry (Jacob and Lapkin 2018), physics (Newman 2008) and sociology (Adamic and Glance 2005). A common task in network analysis is that of clustering (Schaeffer 2007). Network clustering refers to the partitioning of a network into “clusters”, so that nodes in each cluster are similar (in some sense), while nodes in different clusters are dissimilar. For a review © The Author(s). 2020 Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits