Accelerating a Massively Parallel Numerical Simulation in Electromagnetism Using a Cluster of GPUs
Abstract. We have accelerated a legacy massively parallel code solving 3D Maxwell’s equations on a hybrid cluster enhanced with GPUs. To minimize the impact on our existing code, we combine its original Full-MPI approach with task parallelism to design an accelerated LL^T solver that efficiently shares the same GPUs between different processes and relies on optimized communication patterns. On 180 nodes of the Tera100 cluster, our GPU-accelerated LL^T decomposition reaches 80 TFlop/s on a problem with 247980 unknowns, whereas the machine’s sustained CPU and GPU peaks are 13 and 153 TFlop/s, respectively.
Keywords: GPU · Dense linear algebra · Cluster computing · Application · Electromagnetism
1 Context
Accelerators are a promising way to build powerful, energy-efficient machines. Transitioning to these architectures is, however, a significant challenge, especially for legacy codes. In this paper, we consider a massively parallel 3D electromagnetic Full-MPI code that requires a tremendous amount of processing resources, and show how we have modified it to exploit a large cluster enhanced with GPUs. This optimized production code has been written in FORTRAN since the 90s, so we had to adopt a suitable porting methodology based on pragmatic constraints. We have followed a gradual porting methodology that has a limited impact on our code and is flexible enough to be adapted to other architectures in the future (e.g. Intel Xeon Phi processors).
1.1 Physical Problem
The design of stealthy objects requires computing the Radar Cross Section (RCS) of complex 3D targets with complex coatings. The RCS is defined as the ratio between reflected and incident energy in a specific direction. Computing it requires numerically solving Maxwell’s equations, under the harmonic hypothesis, inside penetrable bodies and in the unbounded surrounding free space. Objects can be composed of both conducting and dielectric bodies. The electric and magnetic fields at any point in space can be expressed in terms of surface integrals of the charges and currents induced on the surface of the body, as shown in Fig. 1(a). This problem is discretized using the finite element method, so that we can approximate the electric current at the surface of the domain by solving a complex symmetric (non-Hermitian) dense linear system AX = B. Fig. 1(b) depicts the ratio of energy reflected in each direction due to the currents computed at the surface of the object. We consider stealth objects with very low RCS, so double precision is required to achieve sufficient accuracy. Moreover, a direct method is used because iterative solvers are not well suited to problems with many right-hand sides. With the time h
Fig. 1. An example of RCS computation on the NASA almond object: (a) currents (at 2.6 GHz); (b) RCS (at 8 GHz)
C. Augonnet et al. In: R. Wyrzykowski et al. (Eds.): PPAM 2013, Part I, LNCS 8384, pp. 593–602, 2014. © Springer-Verlag Berlin Heidelberg 2014. DOI: 10.1007/978-3-642-55224-3_55
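To make the linear-algebra step concrete, the following is a minimal sketch of a direct solve for a complex symmetric (non-Hermitian) system AX = B with several right-hand sides. It assumes Python with NumPy, which is not the language of the production code, and the function names `llt_factor` and `llt_solve` are hypothetical. The factorization is unblocked and unpivoted, so it only illustrates the A = L L^T idea (plain transpose, no conjugation) and is not the distributed GPU solver described in the paper. The test matrix is synthetic, with a dominant diagonal chosen so that the unpivoted factorization stays well behaved.

```python
import numpy as np


def llt_factor(a):
    """Return lower-triangular L such that a = L @ L.T (plain transpose).

    Illustrative assumption: no pivoting, so every pivot must be nonzero.
    """
    a = np.array(a, dtype=np.complex128)
    n = a.shape[0]
    l = np.zeros_like(a)
    for j in range(n):
        # Diagonal entry: complex square root, with no conjugation anywhere.
        l[j, j] = np.sqrt(a[j, j] - np.sum(l[j, :j] ** 2))
        # Entries of column j below the diagonal.
        l[j + 1:, j] = (a[j + 1:, j] - l[j + 1:, :j] @ l[j, :j]) / l[j, j]
    return l


def llt_solve(l, b):
    """Solve (L @ L.T) x = b, where the columns of b are right-hand sides."""
    n = l.shape[0]
    x = np.array(b, dtype=np.complex128)
    for i in range(n):                  # forward substitution: L y = b
        x[i] = (x[i] - l[i, :i] @ x[:i]) / l[i, i]
    for i in range(n - 1, -1, -1):      # backward substitution: L.T x = y
        x[i] = (x[i] - l[i + 1:, i] @ x[i + 1:]) / l[i, i]
    return x


# Synthetic example: complex symmetric, non-Hermitian, diagonally dominant.
rng = np.random.default_rng(0)
n, nrhs = 6, 3
m = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
a = m + m.T + 4 * n * np.eye(n)
b = rng.standard_normal((n, nrhs)) + 1j * rng.standard_normal((n, nrhs))

l = llt_factor(a)       # factor once
x = llt_solve(l, b)     # reuse the factor for every right-hand side
residual = np.linalg.norm(a @ x - b)
```

The factorization is computed once and each additional right-hand side only costs two triangular solves, which is the usual argument for a direct method when many right-hand sides must be solved.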