Local Partitioning for Directed Graphs Using PageRank

A local partitioning algorithm finds a set with small conductance near a specified seed vertex. In this paper, we present a generalization of a local partitioning algorithm for undirected graphs to strongly connected directed graphs. In particular, we pro

  • PDF / 429,260 Bytes
  • 13 Pages / 430 x 660 pts Page_size
  • 10 Downloads / 197 Views

DOWNLOAD

REPORT


2

Microsoft Research, Redmond WA 98052 [email protected] University of California, San Diego, La Jolla CA 92093-0112 [email protected] 3 Yahoo! Research, Santa Clara CA 95054 [email protected]

Abstract. A local partitioning algorithm finds a set with small conductance near a specified seed vertex. In this paper, we present a generalization of a local partitioning algorithm for undirected graphs to strongly connected directed graphs. In particular, we prove that by computing a personalized PageRank vector in a directed graph, starting from a single seed vertex within a set S that has conductance at most α, and by performing a sweep over that vector, we can obtain a set of vertices S  with   conductance ΦM (S ) = O( α log |S|). Here, the conductance function ΦM is defined in terms of the stationary distribution of a random walk in the directed graph. In addition, we describe how this algorithm may be applied to the PageRank Markov chain of an arbitrary directed graph, which provides a way to partition directed graphs that are not strongly connected.

1

Introduction

In directed networks like the world wide web, it is critical to develop algorithms that utilize the additional information conveyed by the direction of the links. Algorithms for web crawling, web mining, and search ranking, all depend heavily on the directedness of the graph. For the problem of graph partitioning, it is extremely challenging to develop algorithms that effectively utilize the directed links. Spectral algorithms for graph partitioning have natural obstacles for generalizations to directed graphs. Nonsymmetric matrices do not have a spectral decomposition, meaning there does not necessarily exist an orthonormal basis of eigenvectors. The stationary distribution for random walks on directed graphs is no longer determined by the degree sequences. In the earlier work of Fill [7] and Mihail [12], several generalizations for directed graphs were examined for regular graphs. Lov´ asz and Simonovits [11] established a bound for the mixing rate of an asymmetric ergodic Markov chain in terms of its conductance. When applied to the Markov chain of a random walk in a strongly connected directed graph, their results can be used to identify a set of states of the Markov chain with small conductance. Algorithms for finding sparse cuts, based on linear and semidefinite programming and metric embeddings, have also been generalized to A. Bonato and F.R.K. Chung (Eds.): WAW 2007, LNCS 4863, pp. 166–178, 2007. c Springer-Verlag Berlin Heidelberg 2007 

Local Partitioning for Directed Graphs Using PageRank

167

directed graphs [3,6]. A Cheeger inequality for directed graphs which relies on the eigenvalues of a normalized Laplacian for directed graphs can also be used to find cuts of small conductance [5]. This paper is concerned with a different type of partitioning algorithm, called a local partitioning algorithm. A local partitioning algorithm finds a set with small conductance near a specified seed vertex, and can produce such a cut by examining only a small