On the design of two-stage multiprojection methods for distributed memory systems
B. E. Moutafis · G. A. Gravvanis · C. K. Filelis-Papadopoulos
© Springer Science+Business Media, LLC, part of Springer Nature 2020
Abstract

Solving large sparse linear systems efficiently on supercomputing infrastructures is a time-consuming component of a wide variety of simulation processes. An effective parallel solver should meet the required specifications concerning both convergence behavior and scalability. Herewith, a class of two-stage algebraic domain decomposition preconditioning schemes based on the upper Schur complement method is proposed, in order to appropriately exploit distributed memory systems with multicore processors. The design of the method is focused on homogeneous hybrid parallel systems, i.e., distributed and shared memory systems. However, the proposed method can also be applied to heterogeneous systems, such as cloud infrastructures or hybrid parallel systems with accelerators, by modifying the workload distribution algorithm to take into account the different network latencies and bandwidths. The first stage of the proposed schemes is related to the assignment of the subdomains among the workstations of the distributed system, whereas the second stage concerns the further redistribution of the subdomains to each core of a processor. The proposed method utilizes multiprojection techniques based on semi-aggregated subdomains, leading to improved convergence behavior as the number of subdomains increases. Moreover, a subspace compression technique is used in order to improve the performance of the preprocessing phase and reduce the memory requirements of the proposed scheme. The preconditioning schemes were combined with a parallel Krylov subspace method, namely the parallel preconditioned GMRES(m) method. The convergence behavior, performance, and scalability of the proposed preconditioning schemes are examined and compared to existing state-of-the-art methods through several numerical experiments on supercomputing infrastructures.

Keywords Domain decomposition · Parallel solver · Semi-aggregation · High-performance computing
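The two-stage assignment summarized above (subdomains to workstations, then to the cores of each workstation) can be pictured with a small sketch. The sketch below is an illustrative assumption, not the paper's workload distribution algorithm: the function names (split_evenly, two_stage_assignment), the contiguous balanced splits, and the node/core counts are hypothetical and stand in for the actual scheme.

# Minimal sketch (assumed, not taken from the paper): a two-stage mapping
# of subdomains, first to the nodes of a distributed-memory system and then
# to the cores of each node, using contiguous near-equal splits.
from typing import Dict, List

def split_evenly(items: List[int], parts: int) -> List[List[int]]:
    """Split a list into `parts` contiguous chunks of near-equal size."""
    q, r = divmod(len(items), parts)
    chunks, start = [], 0
    for p in range(parts):
        size = q + (1 if p < r else 0)
        chunks.append(items[start:start + size])
        start += size
    return chunks

def two_stage_assignment(num_subdomains: int, num_nodes: int,
                         cores_per_node: int) -> Dict[int, List[List[int]]]:
    """Stage 1: subdomains -> nodes; stage 2: each node's share -> its cores."""
    subdomains = list(range(num_subdomains))
    per_node = split_evenly(subdomains, num_nodes)           # first stage
    return {node: split_evenly(share, cores_per_node)        # second stage
            for node, share in enumerate(per_node)}

if __name__ == "__main__":
    mapping = two_stage_assignment(num_subdomains=64, num_nodes=4, cores_per_node=8)
    print(mapping[0])   # subdomain lists handled by each core of node 0

A real implementation would weight the splits by subdomain size and by the latency and bandwidth characteristics of a heterogeneous system, as the abstract notes; the equal-size split here only conveys the two-level structure.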
1 Introduction

A sparse linear system can be solved by either direct or iterative methods. It is well known that direct methods are robust for solving sparse linear systems; however, in the case of large sparse linear systems, they have excessive memory requirements and require substantial computational work [5, 12, 17]. On the contrary, preconditioned iterative methods based on Krylov subspaces are more efficient for solving large sparse linear systems on supercomputing infrastructures [2, 34, 40]. Domain decomposition methods have been widely used as preconditioning schemes for Krylov subspace methods for solving large sparse linear systems on multicore clusters, due to their inherent parallelism.
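To make the pairing of a Krylov subspace method with a domain decomposition preconditioner concrete, the following sketch combines SciPy's restarted GMRES with a plain block-Jacobi (non-overlapping domain decomposition) preconditioner. This is only a minimal illustration under assumed choices: the 2D Poisson test matrix, the contiguous block partition, the restart length, and the helper names are hypothetical, and the preconditioner is not the Schur-complement multiprojection scheme proposed in this paper.

# Minimal sketch (not the paper's method): restarted GMRES preconditioned by
# a block-Jacobi domain decomposition preconditioner, built with SciPy.
import numpy as np
import scipy.sparse as sp
import scipy.sparse.linalg as spla

def poisson2d(n):
    """5-point finite-difference Laplacian on an n-by-n grid (test problem)."""
    T = sp.diags([-1, 2, -1], [-1, 0, 1], shape=(n, n))
    I = sp.identity(n)
    return (sp.kron(I, T) + sp.kron(T, I)).tocsr()

def block_jacobi_preconditioner(A, num_blocks):
    """LU-factorize each diagonal block; apply the block solves as M^{-1} r."""
    n = A.shape[0]
    bounds = np.linspace(0, n, num_blocks + 1, dtype=int)
    ranges = list(zip(bounds[:-1], bounds[1:]))
    lus = [spla.splu(A[s:e, s:e].tocsc()) for s, e in ranges]
    def apply(r):
        z = np.empty_like(r)
        for (s, e), lu in zip(ranges, lus):
            z[s:e] = lu.solve(r[s:e])
        return z
    return spla.LinearOperator(A.shape, matvec=apply)

A = poisson2d(64)
b = np.ones(A.shape[0])
M = block_jacobi_preconditioner(A, num_blocks=8)     # blocks mimic subdomains
x, info = spla.gmres(A, b, M=M, restart=30, maxiter=500)
print("converged" if info == 0 else f"info = {info}",
      "residual =", np.linalg.norm(b - A @ x))

In this simple setting each diagonal block plays the role of a subdomain solve; the convergence of such one-level schemes typically degrades as the number of blocks grows, which is the behavior the semi-aggregation and multiprojection techniques proposed in this paper aim to avoid.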