Algorithmic Skeletons and Parallel Design Patterns in Mainstream Parallel Programming

Marco Danelutto1 · Gabriele Mencagli1 · Massimo Torquati1 · Horacio González-Vélez2 · Peter Kilpatrick3



Received: 13 August 2019 / Accepted: 8 October 2020 / © The Author(s) 2020

Abstract

This paper discusses the impact of structured parallel programming methodologies in state-of-the-art industrial and research parallel programming frameworks. We first recap the main ideas underpinning structured parallel programming models and then present the concepts of algorithmic skeletons and parallel design patterns. We then discuss how such concepts have permeated the wider parallel programming community. Finally, we give our personal overview—as researchers active for more than two decades in the parallel programming models and frameworks area—of the process that led to the adoption of these concepts in state-of-the-art industrial and research parallel programming frameworks, and the perspectives they open in relation to the exploitation of forthcoming massively-parallel (both general and special-purpose) architectures.

Keywords Algorithmic skeletons · Parallel design patterns · High performance computing · Multi-core architecture · Parallel computing

International Journal of Parallel Programming

Extended author information available on the last page of the article.

1 Introduction

In the last two decades, the number of parallel architectures available to the masses has substantially increased. The world has moved from clusters/networks of workstations—composed of individual nodes with a small number of CPUs sharing a common memory hierarchy—to the ubiquitous presence of multi-core CPUs coupled with different many-core accelerators, typically interconnected through high-bandwidth, low-latency networks. As a result, parallel application programmers now face the challenge of targeting hundreds of hardware-thread contexts, possibly associated with thousands of GP-GPU cores or, for top500-class architectures, millions of cores. These hardware features exacerbate the "software gap," as they present more substantial challenges to skilled application programmers and to programming framework developers.

The need for programming models and programming frameworks to ease the task of parallel application programmers is therefore acute. A number of "de facto" standards—OpenMP for shared-memory architectures, CUDA and OpenCL for GP-GPUs, MPI for distributed clusters—are widely recognised. Furthermore, other higher-level frameworks such as Intel TBB or Microsoft PPL have been recognised as emerging touchstones. Arguably, such frameworks build upon—and to different degrees, recognise roots in—research results from structured parallel programming.

Our chief contribution in this paper is to provide an outline of the main results from algorithmic skeletons and parallel design patterns that have been migrated to industrial-strength parallel programming frameworks. They have arguably contributed to the acceptance and success of these frameworks al