An Efficient Parallel Load-Balancing Framework for Orthogonal Decomposition of Geometrical Data

1 Blue Brain Project, École Polytechnique Fédérale de Lausanne (EPFL), Biotech Campus, 1202 Geneva, Switzerland
[email protected]
2 Data-Intensive Applications and Systems Laboratory, École Polytechnique Fédérale de Lausanne (EPFL), 1015 Lausanne, Switzerland

Abstract. The accurate subdivision of spatially organized datasets is a complex problem in computer science, and particularly important for load balancing in parallel environments. The problem is to (a) find a partitioning where each partition holds the same number of elements and (b) minimize the communication between partitions (duplicate members). We present a novel parallel load-balancing framework, Sort Balance Split (SBS), which is to our knowledge the first to perform accurate parallel partitioning of multidimensional data while requiring a fixed number of communication steps, independent of network size or input data distribution. Compared to the state-of-the-art sampling and parallel partitioning methods adopted by HPC problems, it delivers better load balancing in a shorter time to solution. We analyse four partitioning schemes that SBS can be applied to, and evaluate our method on 4096 nodes of an IBM BlueGene/Q supercomputer, partitioning up to 1 trillion elements and exhibiting almost-linear scaling properties.

Keywords: Geometric partitioning · Spatial partitioning · Recursive bisection · Jagged partitioning · Load balancing

1 Introduction

Geometrical domain decomposition has been widely applied in several scientific fields. An accurate decomposition is important, as the best performance on a network of computing nodes is achieved through a good distribution of data across all processing entities and a reduction of intra-network communication. On a parallel and homogeneous architecture where the processing time of each element is equal, the best load balancing is an even distribution of elements across all processing entities. Most domain partitioning methods for static geometrical data rely on either connectivity-based or geometry-based decompositions.
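As a concrete illustration of this goal, the following minimal serial sketch (in C++, and not the SBS algorithm described later) partitions a set of 3D points by sorting them along one axis and cutting the sorted sequence into equally sized chunks; the Point type, the choice of axis, and the partition count are illustrative assumptions.

```cpp
// Minimal serial sketch (not the SBS algorithm itself): an even geometric
// split of 3D points across `parts` partitions by sorting along one axis
// and cutting the sorted sequence into equally sized chunks. The Point
// type, axis choice, and partition count are illustrative assumptions.
#include <algorithm>
#include <cstddef>
#include <iostream>
#include <vector>

struct Point { double x, y, z; };

// Assigns points to `parts` partitions so that partition sizes differ by
// at most one element, using a 1D cut along the x coordinate.
std::vector<std::vector<Point>> even_split_x(std::vector<Point> points,
                                             std::size_t parts) {
    std::sort(points.begin(), points.end(),
              [](const Point& a, const Point& b) { return a.x < b.x; });

    std::vector<std::vector<Point>> partitions(parts);
    const std::size_t n = points.size();
    std::size_t begin = 0;
    for (std::size_t p = 0; p < parts; ++p) {
        // Each partition receives floor(n/parts) elements, and the first
        // n % parts partitions receive one extra element.
        const std::size_t count = n / parts + (p < n % parts ? 1 : 0);
        partitions[p].assign(points.begin() + begin,
                             points.begin() + begin + count);
        begin += count;
    }
    return partitions;
}

int main() {
    std::vector<Point> pts = {{0.3, 1, 2}, {2.5, 0, 1}, {1.1, 4, 3},
                              {0.9, 2, 2}, {3.7, 1, 0}, {1.8, 3, 5},
                              {2.2, 0, 4}};
    auto parts = even_split_x(pts, 3);  // 7 points over 3 partitions: 3, 2, 2
    for (std::size_t p = 0; p < parts.size(); ++p)
        std::cout << "partition " << p << ": " << parts[p].size()
                  << " elements\n";
    return 0;
}
```

Sorting before cutting keeps spatially close elements (along the chosen axis) in the same partition, while the floor/ceiling chunk sizes guarantee that partition sizes differ by at most one element, i.e. the even distribution described above.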

Connectivity-based partitioning analyses how elements are connected and aims to equalize the weight of edges or nodes via graph or hypergraph partitioning [17]. A hypergraph is a generalization of a graph in which edges, called hyperedges, can connect more than two vertices. Applications of hypergraph partitioning include numerical linear algebra [5], integrated circuit design [14], and web document categorization [3]. Scotch [18], METIS and ParMETIS [13] are the most commonly used large-scale graph partitioners. For hypergraph partitioning, Zoltan [4] is the most commonly used toolkit.

Geometry-based partitioning relies on the principle that spatially placed elements require information from neighbouring elements for the processing of data. Such methods tend to divide t