The scalability analysis of a parallel tree search algorithm

  • PDF / 338,715 Bytes
  • 16 Pages / 439.37 x 666.142 pts Page_size
  • 47 Downloads / 199 Views

DOWNLOAD

REPORT


The scalability analysis of a parallel tree search algorithm Roman Kolpakov1,2

· Mikhail Posypkin2,3

Received: 16 July 2019 / Accepted: 4 February 2020 © Springer-Verlag GmbH Germany, part of Springer Nature 2020

Abstract Increasing the number of computational cores is a primary way of achieving the high performance of contemporary supercomputers. However, developing parallel applications capable to harness the enormous amount of cores is a challenging task. It is very important to understand the principle limitations of the scalability of parallel applications imposed by the algorithm’s structure. The tree search addressed in this paper has an irregular structure unknown prior to computations. That is why such algorithms are challenging for parallel implementation especially on distributed memory systems. In this paper, we propose a parallel tree search algorithm aimed at distributed memory parallel computers. For this parallel algorithm, we analyze its scalability and show that it is close to the theoretical maximum. Keywords Parallel scalability · Parallel efficiency · Complexity analysis · Parallel tree search · Global optimization

1 Introduction The proposed theoretical approach is aimed at measuring the parallel efficiency of search algorithms with a tree-like structure. Many deterministic optimization algorithms (branch-and-bound, branch-and-cut, etc.) fall into this category. We focus on a particular case of such algorithms, when the search tree is stable, not affected by

B

Roman Kolpakov [email protected] Mikhail Posypkin [email protected]

1

Lomonosov Moscow State University, GSP-1, Leninskie Gory, Moscow, Russia

2

Federal Research Center “Computer Science and Control” of Russian Academy of Sciences, Vavilov st. 40, Moscow, Russia

3

National Research University Higher School of Economics, 20 Myasnitskaya Ulitsa, Moscow 101000, Russia

123

R. Kolpakov, M. Posypkin

the parallelization. One important example of such algorithms is proposed in [5] for solving systems of non-linear inequalities. Various parallel implementations of tree search algorithms for shared [4,8,15] and distributed memory multiprocessors [1,6] and GPU accelerators [2,20] were proposed. Smart parallel methods for continuous global optimization problems were proposed in [16,18,19]. Convergence properties of parallel global search algorithms were studies in [7,17]. In [14] the exact lower bound for parallel complexity of so-called frontal parallel branch-and-bound algorithm was obtained. It was shown [13] that scalability significantly depends on the problem coefficients and hence the shape of the search tree. The dependence of the parallel efficiency on the search tree was also observed in [3,21]. The parallel execution time of the proposed algorithm aimed at distributed memory machines is O(T /m + md) where T is the time of solving of the problem by one processor, d is the depth of the search tree, and m is the number of processors. A randomized distributed memory parallel back-track search algorithm proposed in [9] achieves