Performance and Energy Analysis of the Iterative Solution of Sparse Linear Systems on Multicore and Manycore Architectures


. de Ingeniería y Ciencia de Computadores, Universidad Jaume I, 12.071 Castellón, Spain
{aliaga,castillo,jfernand,leon,al001566,quintana}@uji.es
Innovative Computing Lab (ICL), University of Tennessee, Knoxville, USA
[email protected]

Abstract. In this paper we investigate the performance-energy balance of a variety of concurrent architectures, from general-purpose and digital signal multicore systems to graphics processors (GPUs), representative of current technology. This analysis employs the conjugate gradient method, an important algorithm for the iterative solution of linear systems that consists mainly of the sparse matrix-vector product and other (minor) vector kernels. To allow a fair comparison, we leverage simple implementations of the numerical methods and underlying kernels, and rely only on those optimizations applied by the target compiler.

Keywords: Energy efficiency · High-performance computing · Sparse linear algebra · Multicore processors · Low-power processors · GPUs

1 Introduction

Competing for the world's first exascale system, many high performance computing (HPC) initiatives have identified the power wall as a key challenge that will have to be confronted, resulting in an unmistakable call for power-efficient systems [5,9]. At the other end of the spectrum, energy-efficient components are essential for extended battery life of mobile appliances like smartphones and tablets, and hardware companies devote considerable effort to integrating sophisticated energy-saving mechanisms into embedded devices. These two trends seem to be converging, though, and a small number of recent HPC research prototypes aim at delivering high performance-power ratios by adopting technology originally designed for the mobile market [1,2].

Although most manufacturers advertise the power efficiency of their products by providing theoretical energy specifications, an equitable comparison between different hardware architectures remains difficult. The reason is not only that distinct devices are often designed for one particular type of computation, but also that they are tailored for either performance or power efficiency. New energy-related metrics have recently been proposed to analyze the balance between these two key figures [7], but the situation becomes increasingly difficult once the different levels of optimization applied to an algorithm enter the picture. In particular, extensive software optimization that results in significant performance and power improvements for one specific hardware platform is also likely to hamper the portability of the code to other architectures.

R. Wyrzykowski et al. (Eds.): PPAM 2013, Part I, LNCS 8384, pp. 772–782, 2014. © Springer-Verlag Berlin Heidelberg 2014. DOI: 10.1007/978-3-642-55224-3_72

In this paper we provide a map of the energy-performance landscape of a variety of general-purpose and specialized hardware architectures using the conjugate gradient (CG) method, a key algorithm for the numerical solution of symmetric