A numerical evaluation of the accuracy of influence maximization algorithms

  • PDF / 1,373,441 Bytes
  • 10 Pages / 595.276 x 790.866 pts Page_size
  • 8 Downloads / 178 Views

DOWNLOAD

REPORT


ORIGINAL ARTICLE

A numerical evaluation of the accuracy of influence maximization algorithms Hautahi Kingi1   · Li‑An Daniel Wang2 · Tom Shafer3 · Minh Huynh4 · Mike Trinh5 · Aaron Heuser8 · George Rochester6 · Antonio Paredes7 Received: 26 March 2020 / Revised: 22 July 2020 / Accepted: 24 July 2020 © Springer-Verlag GmbH Austria, part of Springer Nature 2020

Abstract We develop an algorithm to compute exact solutions to the influence maximization problem using concepts from reverse influence sampling (RIS). We implement the algorithm using GPU resources to evaluate the empirical accuracy of theoretically guaranteed greedy and RIS approximate solutions. We find that the approximations yield solutions that are remarkably close to optimal—usually achieving greater than 99% of the optimal influence spread. This accuracy is consistent across a wide range of network structures. Keywords  Networks · Influence maximization · GPU computing

1 Introduction Influence maximization (IM) in networks is a well-studied problem with applications in viral marketing, epidemiology, and public health among many others. The basic problem attempts to select a set of initial seed nodes in a network to maximize the resulting expected spread of influence initiated at those nodes. Finding a solution to this problem suffers from a combinatorial explosion in the number of candidate seed sets as the size of the network or seed set increases. For example, in a relatively small network of 1000 nodes, there are approximately 8 trillion seed set candidates of size 5. The

large computational burden involved in solving this NP-hard problem prompted substantial research over the last decadeand-a-half aimed at developing approximate solutions. These approximations can broadly be classified into two groups— theoretically guaranteed algorithms that are proven to obtain solutions within a given factor of the optimal solution and heuristics that sacrifice theoretical guarantees for speed. This article focuses on the former. Theoretically guaranteed approximation algorithms (Kempe et  al. 2003; Leskovec et  al. 2007; Goyal et  al. 2011; Borgs et  al. 2014; Nguyen et al. 2016; Tang et al. 2018) provide solutions that achieve an expected spread of influence within a factor of (1 − 1∕e) ≈ 0.63 of the optimal solution with a

* Hautahi Kingi [email protected]

Antonio Paredes [email protected]

Li‑An Daniel Wang [email protected]

1



Facebook Inc, New York, NY, USA

Tom Shafer [email protected]

2



Sam Houston State University, Huntsville, TX, USA

3



Elder Research, Inc, Raleigh, NC, USA

4



IMPAQ International, Washington DC, USA

5

Mike Trinh [email protected]



IMPAQ International, Boston, MA, USA

6



Northeastern University, Boston, MA, USA

Aaron Heuser [email protected]

7



U.S. Food and Drug Administration, Silverspring, MD, USA

8



IMPAQ International, Columbia, MD, USA

Minh Huynh [email protected]

George Rochester [email protected]

13

Vol.:(0123456789)

70  

Page 2 of 10

given p