Reviewing GPU architectures to build efficient back projection for parallel geometries

ORIGINAL RESEARCH PAPER

Suren Chilingaryan¹ · Evelina Ametova²,³ · Andreas Kopmann¹ · Alessandro Mirone⁴

Received: 8 October 2018 / Accepted: 10 May 2019
© The Author(s) 2019

Abstract

Back-projection is the principal algorithm in computed tomography for reconstructing images from a set of recorded projections. It is used in both fast analytical methods and high-quality iterative techniques. X-ray imaging facilities rely on back-projection to reconstruct internal structures in material samples and living organisms with high spatial and temporal resolution. Fast image reconstruction is also essential for tracking and controlling the processes under study in real time. In this article, we present efficient implementations of the back-projection algorithm for parallel hardware. We survey a range of parallel architectures released by the major hardware vendors during the last 10 years. We analyze the similarities and differences between these architectures and highlight how specific features can be used to enhance reconstruction performance. In particular, we build a performance model to find hardware hotspots and propose several optimizations that balance the load between the texture engine, the computational and special function units, and the different types of memory, so that all GPU subsystems are utilized in parallel. We further show that targeting architecture-specific features boosts performance by a factor of 2 to 7 compared to the current state-of-the-art algorithms used in standard reconstruction codes. The suggested load-balancing approach is not limited to back-projection and can be used as a general optimization strategy for implementing parallel algorithms.

Keywords: Parallel algorithms · Hardware architecture · GPU computing · Synchrotron tomography · Back-projection · CUDA · OpenCL

¹ Karlsruhe Institute of Technology, Karlsruhe, Germany
² KU Leuven, Leuven, Belgium
³ The University of Manchester, Manchester, UK
⁴ ESRF, Grenoble, France

1 Introduction

X-ray tomography is a powerful tool to investigate materials and small animals at the micro- and nano-scale [1]. Information about X-ray attenuation and/or phase changes in the sample is used to reconstruct its internal structure. Recent advances in X-ray optics and detector technology have paved the way for a variety of new X-ray imaging experiments aiming to study dynamic processes in materials and to analyze small organisms in vivo. At the Swiss Light Source (SLS), scientists were able to take high-quality 3D snapshots of 150 Hz oscillations of a blowfly flight motor [2]. A temporal resolution of 20 ms was achieved during a stencil test performed at SLS [3] and also in the analysis of morphological dynamics of fast-moving weevils at the ANKA synchrotron at KIT [4]. To achieve these results, the inst