P-NoC: Performance Evaluation and Design Space Exploration of NoCs for Chip Multiprocessor Architecture Using FPGA


Khyamling Parane¹ · B. M. Prabhu Prasad¹ · Basavaraj Talawar¹

© Springer Science+Business Media, LLC, part of Springer Nature 2020

Abstract

The network-on-chip (NoC) has emerged as an efficient and scalable communication fabric for chip multiprocessors (CMPs) and multiprocessor systems-on-chip (MPSoCs). The NoC architecture, the router micro-architecture and the links significantly influence the overall performance of CMPs and MPSoCs. In this paper, we propose P-NoC, an FPGA-based parameterized framework for analyzing the performance of NoC architectures under various design parameters. The mesh and multi-local port mesh (ML-mesh) topologies are considered in the study. By fine-tuning various NoC parameters and synthesizing on the FPGA, we identify that the performance of NoC architectures is influenced by the configuration of the router parameters and the interconnect. Experiments show that the flit width, buffer depth and virtual channel parameters have a significant impact on FPGA resource utilization. We analyze the performance of the NoCs under six traffic patterns: uniform, bit shuffle, random permutation, transpose, bit complement and nearest neighbor. With the router and interconnect parameters configured, the ML-mesh topology uses 75% fewer FPGA resources than the mesh. The ML-mesh topology shows a 33.2% improvement in network latency under the localized traffic pattern. The mesh and ML-mesh topologies have 0.53× and 0.1× higher saturation throughput under nearest neighbor traffic than under uniform random traffic.

Keywords  FPGA · Performance · Network-on-chip · NOCs · Mesh topology · System-on-chip
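The six synthetic traffic patterns named in the abstract can be illustrated with a minimal sketch that maps a source node to a destination node. This is not the P-NoC implementation: the function names, the row-major node numbering on a k × k mesh, the power-of-two node count, and the "+1 in each dimension with wrap-around" reading of nearest neighbor are all assumptions, chosen to follow definitions commonly used in NoC simulators.

```python
import random

# Hypothetical helpers, assuming N = 2**bits nodes numbered row-major
# on a k x k mesh (node id = y * k + x). Not taken from the paper.

def bit_complement(src, bits):
    """Destination is the bitwise complement of the source id."""
    return ~src & ((1 << bits) - 1)

def bit_shuffle(src, bits):
    """Destination is the source id rotated left by one bit."""
    mask = (1 << bits) - 1
    return ((src << 1) | (src >> (bits - 1))) & mask

def transpose(src, bits):
    """Swap the upper and lower halves of the id, i.e. (x, y) -> (y, x)."""
    half = bits // 2
    low_mask = (1 << half) - 1
    return ((src & low_mask) << half) | (src >> half)

def nearest_neighbor(src, k):
    """Assumed definition: step +1 in each dimension, wrapping at the edge."""
    x, y = src % k, src // k
    return ((y + 1) % k) * k + (x + 1) % k

def uniform_random(src, n, rng):
    """Pick any node other than the source with equal probability."""
    dst = rng.randrange(n - 1)
    return dst if dst < src else dst + 1

def random_permutation(n, rng):
    """Fixed random one-to-one mapping; perm[src] is the destination.
    A node may be mapped to itself in this simple version."""
    perm = list(range(n))
    rng.shuffle(perm)
    return perm

if __name__ == "__main__":
    rng = random.Random(0)
    k, bits = 4, 4  # 4 x 4 mesh, 16 nodes
    perm = random_permutation(k * k, rng)
    for src in (0, 5, 9):
        print(src, bit_complement(src, bits), bit_shuffle(src, bits),
              transpose(src, bits), nearest_neighbor(src, k),
              uniform_random(src, k * k, rng), perm[src])
```

Under these assumptions, nearest neighbor keeps every packet within one hop per dimension of its source (apart from wrap-around at the mesh edge), which is the sense in which it counts as a localized pattern, whereas bit complement and transpose send most packets across the mesh.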

Khyamling Parane and B. M. Prabhu Prasad contributed equally to this research work.

* Khyamling Parane, [email protected]
B. M. Prabhu Prasad, [email protected]
Basavaraj Talawar, [email protected]

¹ SPARK Lab, Department of CSE, National Institute of Technology, Surathkal, Karnataka, India





1 Introduction

Multiprocessor systems-on-chip (MPSoCs) and chip multiprocessors (CMPs) continue to grow in size, complexity and number of processing elements (PEs) [1, 2]. CMPs and MPSoCs require a scalable and efficient communication fabric, as the traditional point-to-point bus-based interconnection suffers from high latency and low throughput [1–3]. NoCs are a modular, switching-based communication fabric that interconnects PEs through routers and links arranged in a bandwidth-demand-driven topology [4–6]. The choice of NoC in an MPSoC influences the latency, silicon area, speed and throughput. For these reasons, a parameterized model is needed for early estimation of the silicon area and performance of an NoC architecture matched to the target application [7]. NoC implementations distribute traffic to avoid congestion while operating under strict bandwidth, performance and energy constraints  [8–1