A Comparison of Parallel Solvers for Diagonally Dominant and General Narrow-Banded Linear Systems II

We continue the comparison of parallel algorithms for solving diagonally dominant and general narrow-banded linear systems of equations that we started in [2]. The solvers compared are the banded system solvers of ScaLAPACK [6] and those investigated by A

  • PDF / 159,293 Bytes
  • 10 Pages / 431 x 666 pts Page_size
  • 58 Downloads / 173 Views

DOWNLOAD

REPORT


Institute of Scientific Computing, ETH Zurich [email protected] 2 Center for Applied Scientific Computing, Lawrence Livermore National Laboratory [email protected] 3 Department of Computer Science, University of Tennessee, Knoxville [email protected] 4 Computer Sciences Laboratory, RSISE, Australian National University, Canberra [email protected]

Abstract. We continue the comparison of parallel algorithms for solving diagonally dominant and general narrow-banded linear systems of equations that we started in [2]. The solvers compared are the banded system solvers of ScaLAPACK [6] and those investigated by Arbenz and Hegland [1, 5]. We present the numerical experiments that we conducted on the IBM SP/2.

1

Introduction

In this note we continue the comparison of direct parallel solvers for narrowbanded systems of linear equations Ax = b

(1)

that we started in [2]. The n-by-n matrix A has a narrow band if its lower halfbandwidth kl and upper half-bandwidth ku are much smaller than the order of A, kl + ku  n. We separately compare implementations of an algorithm for solving diagonally dominant and of an algorithm for solving arbitrary band systems. The algorithm for the diagonally dominant band system can be interpreted as a generalization of the well known tridiagonal cyclic reduction (CR), or more usefully, as Gaussian elimination applied to a symmetrically permuted system of equations (P AP T )P x = P b. The latter interpretation has important consequences, such as it implies that the algorithm is backward stable [4]. The permutation enhances (coarse grain) parallelism. Unfortunately, it also causes Gaussian elimination to generate fill-in which in turn increases the computational complexity as well as the memory requirements of the algorithm [1, 6]. The algorithm for the arbitrary band system can be interpreted as a generalization of bidiagonal CR [10] which is equivalent to Gaussian elimination applied P. Amestoy et al. (Eds.): Euro-Par’99, LNCS 1685, pp. 1078–1087, 1999. c Springer-Verlag Berlin Heidelberg 1999

Parallel Solvers for Narrow-Banded Linear Systems

1079

to a nonsymmetrically permuted system of equations (P AQT )Qx = P b. Here, the right permutation Q enhances parallelism, while the left permutation P enhances stability in that it incorporates the row-exchanges caused by pivoting. Recently, the authors presented experiments with implementations of these algorithms using up to 128 processors of an Intel Paragon [2]. In this paper we complement those comparisons of the ScaLAPACK implementations with the experimental implementations by Arbenz and Hegland [1, 5] by timings on the IBM SP/2 at ETH Zurich. In section 2 we present our results for the diagonally dominant case. In section 3 the case of arbitrary band matrices is discussed. We draw our conclusions in section 4.

2

Experiments with the Band Solver for Diagonally Dominant Systems

An n-by-n diagonally dominant band matrix is split according to  A1 B1U   B1L C1 D2U   L U   D A B 2 2 2   A= , .. .. ..   . . .   L