An Efficient Parallel Load-Balancing Framework for Orthogonal Decomposition of Geometrical Data

1 Blue Brain Project, École Polytechnique Fédérale de Lausanne (EPFL), Biotech Campus, 1202 Geneva, Switzerland
[email protected]
2 Data-Intensive Applications and Systems Laboratory, École Polytechnique Fédérale de Lausanne (EPFL), 1015 Lausanne, Switzerland

Abstract. The accurate subdivision of spatially organized datasets is a complex problem in computer science, and particularly important for load balancing in parallel environments. The problem is to (a) find a partitioning where each partition holds the same number of elements and (b) minimize the communication between partitions (duplicate members). We present a novel parallel load-balancing framework, Sort Balance Split (SBS), which is to our knowledge the first to perform accurate parallel partitioning of multidimensional data while requiring a fixed number of communication steps, independent of network size or input data distribution. Compared to the state-of-the-art sampling and parallel partitioning methods adopted by HPC problems, it delivers better load balancing in a shorter time to solution. We analyse four partitioning schemes that SBS can be applied to, and evaluate our method on 4096 nodes of an IBM BlueGene/Q supercomputer, partitioning up to 1 trillion elements and exhibiting almost-linear scaling properties.

Keywords: Geometric partitioning · Spatial partitioning · Recursive bisection · Jagged partitioning · Load balancing

1 Introduction

Geometrical domain decomposition has been widely applied in several scientific fields. An accurate decomposition is important, as the best performance on a network of computing nodes is achieved through a good distribution of data across all processing entities and a reduction of intra-network communication. On a parallel and homogeneous architecture where the processing time of each element is equal, the best load balancing is an even distribution of elements across all processing entities. Most domain partitioning methods for static geometrical data rely on either connectivity-based or geometry-based decompositions.
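As a concrete illustration of this goal, the following minimal serial sketch (in C++, and not the SBS algorithm described later) partitions a set of 3D points by sorting them along one axis and cutting the sorted sequence into equally sized chunks; the Point type, the choice of axis, and the partition count are illustrative assumptions.

```cpp
// Minimal serial sketch (not the SBS algorithm itself): an even geometric
// split of 3D points across `parts` partitions by sorting along one axis
// and cutting the sorted sequence into equally sized chunks. The Point
// type, axis choice, and partition count are illustrative assumptions.
#include <algorithm>
#include <cstddef>
#include <iostream>
#include <vector>

struct Point { double x, y, z; };

// Assigns points to `parts` partitions so that partition sizes differ by
// at most one element, using a 1D cut along the x coordinate.
std::vector<std::vector<Point>> even_split_x(std::vector<Point> points,
                                             std::size_t parts) {
    std::sort(points.begin(), points.end(),
              [](const Point& a, const Point& b) { return a.x < b.x; });

    std::vector<std::vector<Point>> partitions(parts);
    const std::size_t n = points.size();
    std::size_t begin = 0;
    for (std::size_t p = 0; p < parts; ++p) {
        // Each partition receives floor(n/parts) elements, and the first
        // n % parts partitions receive one extra element.
        const std::size_t count = n / parts + (p < n % parts ? 1 : 0);
        partitions[p].assign(points.begin() + begin,
                             points.begin() + begin + count);
        begin += count;
    }
    return partitions;
}

int main() {
    std::vector<Point> pts = {{0.3, 1, 2}, {2.5, 0, 1}, {1.1, 4, 3},
                              {0.9, 2, 2}, {3.7, 1, 0}, {1.8, 3, 5},
                              {2.2, 0, 4}};
    auto parts = even_split_x(pts, 3);  // 7 points over 3 partitions: 3, 2, 2
    for (std::size_t p = 0; p < parts.size(); ++p)
        std::cout << "partition " << p << ": " << parts[p].size()
                  << " elements\n";
    return 0;
}
```

Sorting before cutting keeps spatially close elements (along the chosen axis) in the same partition, while the floor/ceiling chunk sizes guarantee that partition sizes differ by at most one element, i.e. the even distribution described above.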

Connectivity-based partitioning analyses how elements are connected and aims to equalize the weight of edges or nodes via graph or hypergraph partitioning [17]. A hypergraph is a generalization of a graph in which edges, called hyperedges, can connect more than two vertices. Applications of hypergraph partitioning include numerical linear algebra [5], integrated circuit design [14], and web document categorization [3]. Scotch [18], METIS and ParMETIS [13] are the most commonly used large-scale graph partitioners. For hypergraph partitioning, Zoltan [4] is the most commonly used toolkit.

Geometry-based partitioning relies on the principle that spatially placed elements require information from neighbouring elements for the processing of data. Such methods tend to divide t