Graph algorithms: parallelization and scalability
POSITION PAPER
October 2020, Vol. 63 203101:1–203101:21 https://doi.org/10.1007/s11432-020-2952-7
Wenfei FAN1,2,3*, Kun HE2,4, Qian LI2,4 & Yue WANG2

1 School of Informatics, University of Edinburgh, Edinburgh EH8 9AB, UK;
2 Shenzhen Institute of Computing Sciences, Shenzhen University, Shenzhen 518060, China;
3 Beijing Advanced Innovation Center for Big Data and Brain Computing, Beihang University, Beijing 100191, China;
4 Guangdong Province Key Laboratory of Popular High Performance Computers, Shenzhen University, Shenzhen 518060, China
Received 1 April 2020/Revised 9 May 2020/Accepted 10 June 2020/Published online 21 September 2020
Abstract For computations on large-scale graphs, one often resorts to parallel algorithms. However, parallel algorithms are difficult to write, debug, and analyze. Worse still, it is difficult to make algorithms parallelly scalable, i.e., such that they run faster when more machines are used. Indeed, it is not yet known whether any PTIME computational problems admit parallelly scalable algorithms on shared-nothing systems. Is it possible to parallelize sequential graph algorithms and guarantee convergence to correct results as long as the sequential algorithms are correct? Moreover, does a PTIME problem exist that is parallelly scalable on shared-nothing systems? This position paper answers both questions in the affirmative.

Keywords parallelization, parallel scalability, PTIME problems, graph algorithms, shared-nothing systems
Citation Fan W F, He K, Li Q, et al. Graph algorithms: parallelization and scalability. Sci China Inf Sci, 2020, 63(10): 203101, https://doi.org/10.1007/s11432-020-2952-7
1 Introduction
When processing large-scale graphs, parallel computations are often a must. Several parallel systems have been developed for graph computations, e.g., Pregel [1], PowerGraph [2], and Giraph++ [3]. However, to make effective use of parallel graph computations, at least two issues have to be addressed.

The first issue concerns the programming models of parallel graph systems. Parallel algorithms are difficult to write, debug, and analyze [4]. On the one hand, a large number of sequential (single-machine) graph algorithms are already in place. On the other hand, to use Pregel, for instance, one has to "think like a vertex" and recast the existing algorithms into a vertex-centric model; similarly when programming with other parallel graph systems. The recasting is nontrivial for people who are not very familiar with parallel models. This makes these parallel systems a privilege for experienced users.

The second issue involves the effectiveness of parallel algorithms. When we turn to a parallel graph system, we assume that when provided with more processors (computing nodes), we can make effective use of the additional resources and speed up the execution of our algorithms. However, is this assumption always correct? Parallel systems typically adopt the shared-nothing architecture nowadays. On such a system, the use
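To illustrate what "thinking like a vertex" entails, the following is a minimal single-machine sketch of the vertex-centric model, computing connected components by label propagation. The function name and superstep loop are our own illustrative assumptions; a real Pregel deployment distributes vertices across many workers and exchanges messages over the network, which this sketch elides.

```python
from collections import defaultdict

def vertex_centric_components(edges, num_vertices):
    """Connected components, Pregel-style: in each superstep every
    vertex adopts the smallest label received from its neighbors and,
    if its label improved, messages the new label to its neighbors.
    Computation halts when no vertex changes (all vote to halt)."""
    adj = defaultdict(list)
    for u, v in edges:          # build an undirected adjacency list
        adj[u].append(v)
        adj[v].append(u)

    label = {v: v for v in range(num_vertices)}
    # Superstep 0: every vertex sends its own label to all neighbors.
    inbox = {v: [u for u in adj[v]] for v in range(num_vertices)}
    active = True
    while active:
        active = False
        outbox = defaultdict(list)
        for v in range(num_vertices):
            smallest = min(inbox[v], default=label[v])
            if smallest < label[v]:     # label improved: stay active
                label[v] = smallest
                for u in adj[v]:
                    outbox[u].append(smallest)
                active = True
        # Next superstep: each vertex sees only this round's messages.
        inbox = {v: outbox[v] for v in range(num_vertices)}
    return label
```

On the graph with edges {(0,1), (1,2), (3,4)}, vertices 0-2 converge to label 0 and vertices 3-4 to label 3. Even for this simple algorithm, the sequential union-find formulation had to be recast into per-vertex local computations plus message passing, which is the nontrivial rewriting effort the text refers to.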