Parallelization of Molecular-Dynamics Simulations Using Tasks


Ralf Meyer¹,² and Chris M. Mangiardi¹
¹ Department of Mathematics and Computer Science, Laurentian University, Sudbury, ON P3E 2C6, Canada
² Department of Physics, Laurentian University, Sudbury, ON P3E 2C6, Canada

ABSTRACT

This article discusses novel algorithms for molecular-dynamics (MD) simulations with short-ranged forces on modern multi- and many-core processors like the Intel Xeon Phi. A task-based approach to the parallelization of MD on shared-memory computers and a tiling scheme to facilitate the SIMD vectorization of the force calculations are described. The algorithms have been tested with three different potentials and the resulting speed-ups on Intel Xeon Phi coprocessors are shown.

INTRODUCTION

Molecular dynamics (MD) is one of the most important computational methods in materials science. Although the size of problems that can be studied with MD has increased tremendously thanks to the progress in computing technology, many problems remain that can only be tackled with MD if more computational power becomes available. It is therefore important that computer programs for MD simulations are able to make full use of modern CPUs. Further problems arise because the systems studied with MD have become not only larger but also more complex and less homogeneous. This increased complexity leads to load-balancing issues that can significantly reduce the computational efficiency.

In recent years, the power of computers has been improved mainly by two means: the number of independent compute cores integrated in modern multi-core processors has been increased, and single instruction multiple data (SIMD) units with increasingly wide vector registers have been implemented. The latest step in this direction is the Intel Xeon Phi coprocessor, a many-core processor that integrates 60 compute cores with four hardware threads per core and 512-bit-wide SIMD units.
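As a rough illustration of what the hardware figures quoted above imply for a program, the following Python sketch works out the number of hardware threads and the number of values one SIMD register can hold. The function names are hypothetical, and the element widths (32 bits for single precision, 64 bits for double precision) are assumptions about the floating-point format rather than figures taken from this article.

```python
# Figures quoted in the text for the Intel Xeon Phi coprocessor.
CORES = 60
THREADS_PER_CORE = 4
SIMD_WIDTH_BITS = 512

# Assumed element widths (IEEE 754 single and double precision).
SINGLE_PRECISION_BITS = 32
DOUBLE_PRECISION_BITS = 64


def hardware_threads(cores, threads_per_core):
    """Total number of hardware threads that must be kept busy."""
    return cores * threads_per_core


def simd_lanes(register_bits, element_bits):
    """Number of elements processed by one SIMD instruction."""
    return register_bits // element_bits


if __name__ == "__main__":
    print("hardware threads:", hardware_threads(CORES, THREADS_PER_CORE))
    print("single-precision lanes:", simd_lanes(SIMD_WIDTH_BITS, SINGLE_PRECISION_BITS))
    print("double-precision lanes:", simd_lanes(SIMD_WIDTH_BITS, DOUBLE_PRECISION_BITS))
```

Under these assumptions, a program has to supply work for 240 hardware threads, and each vector instruction can operate on 16 single-precision or 8 double-precision values at once.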
This article describes our work on the development of new algorithms for MD simulations with short-ranged forces. The algorithms were specifically designed to be efficient on multi- and many-core processors with wide SIMD vector units like the Xeon Phi. In addition, the algorithms avoid certain types of load-balancing problems that arise in simulations of inhomogeneous systems.

ALGORITHMS

Parallel algorithms for molecular dynamics

Exploiting multiple processing cores and exploiting SIMD units both require parallel programming, albeit in different forms. SIMD units achieve data-level parallelism, whereas multi-core processors enable task parallelism. MD is inherently a parallel problem since the calculation of the forces, which is usually the most time-consuming operation, can be performed independently for each particle. In practice, however, things are not so simple. Many simulations in materials science use short-ranged force models where the force between two particles is zero if the distance between the particles exceeds a cut-off radius r_cut. While short-ranged forces significantly red