Optimal Parallel Prefix Sum Computation on Three-Dimensional Torus Network
- PDF / 388,734 Bytes
- 4 Pages / 595.276 x 790.866 pts Page_size
- 47 Downloads / 214 Views
SHORT COMMUNICATION
Optimal Parallel Prefix Sum Computation on Three-Dimensional Torus Network Ashish Gupta1
Received: 14 May 2020 / Revised: 11 August 2020 / Accepted: 19 August 2020 Ó The National Academy of Sciences, India 2020
Abstract In this article, an optimal parallel algorithm is suggested for fast computation of a crucial yet fundamental numerical problem in science named ‘prefix sum’ on a popular three-dimensional torus network. The proposed parallel algorithm demands total ‘4n ? 2’ communication moves for mapping prefix sum problem on odd network sizes of n 9 n9n three-dimensional torus network, whereas total ‘4n’ communication moves is required for parallel mapping of same on even network sizes. In fact, a unit increase in each dimension from odd to even and even to odd network sizes resulted in the constant increase of 2 and 6 units, respectively, in the required communication moves for mapping prefix sum. The algorithmic performance of the proposed parallel algorithm can be compared to counter-part parallel prefix sum approaches on gridbased networks. Keywords Torus Prefix sum Broadcast Biswapped Biswapped-Torus
Among interconnects, three-dimensional torus has adopted as a parallel interconnect in many well-known fast supercomputers around the globe. According to reports by IBM (May, 2013) [1], by system share, three-dimensional torus constitutes total 6.8% among interconnects in TOP 500 list of supercomputers, and by performance share, three-dimensional torus constitutes total 17.8%. However, by system and performance share, Tree constitutes highest & Ashish Gupta [email protected] 1
Institute of Engineering and Technology, Dr. Rammanohar Lohia Avadh University, Ayodhya, India
percentage share (84% system share and 47.6% performance share) among interconnects in TOP 500 list of supercomputers. The prefix is an elementary numerical problem, since it can be utilized in computing and accomplishing various real-world applications. For example, sorting data elements using counting sort methodology by finding position of keys on the basis of their frequencies. Similarly, prefix sum problem can also be utilized to map polynomial interpolation; such as divide different coefficients for Newton’s interpolation polynomial or generalized divide difference for hermit interpolation on various parallel architectures. Our proposed parallel prefix computation approach on three-dimensional torus can be compared to counter-part parallel approaches for prefix sum on grid-based networks mentioned in [2–7]. 3-Dimensional Torus: The entire architecture of n 9 n9n three-dimensional torus composed of n3 nodes, where each node has six neighbour nodes in the direction of ?X, -X, ?Y, -Y, ?Z, and –Z-axis. Truly saying, a n 9 n9n three-dimensional torus composed of total ‘n’ two-dimensional torus networks. For instance, a 4 9 494 three-dimensional torus network composed of total four two-dimensional torus networks. For better understanding, such four two-dimensional torus networks is displayed in Fig. 1 by red,
Data Loading...