P-NoC: Performance Evaluation and Design Space Exploration of NoCs for Chip Multiprocessor Architecture Using FPGA


Khyamling Parane1 · B. M. Prabhu Prasad1 · Basavaraj Talawar1

© Springer Science+Business Media, LLC, part of Springer Nature 2020

Abstract The network-on-chip (NoC) has emerged as an efficient and scalable communication fabric for chip multiprocessors (CMPs) and multiprocessor systems-on-chip (MPSoCs). The NoC architecture, the router micro-architecture and the links significantly influence the overall performance of CMPs and MPSoCs. In this paper, we propose P-NoC: an FPGA-based parameterized framework for analyzing the performance of NoC architectures under various design decision parameters. The mesh and multi-local port mesh (ML-mesh) topologies have been considered for the study. By fine-tuning various NoC parameters and synthesizing on the FPGA, we identify that the performance of NoC architectures is influenced by the configuration of the router parameters and the interconnect. Experiments show that the flit width, buffer depth and virtual channel parameters have a significant impact on the FPGA resources. We analyze the performance of the NoCs on six traffic patterns, viz., uniform, bit shuffle, random permutation, transpose, bit complement and nearest neighbor. With the router and interconnect parameters configured, the ML-mesh topology uses 75% fewer FPGA resources than the mesh. The ML-mesh topology shows an improvement of 33.2% in network latency under a localized traffic pattern. The mesh and ML-mesh topologies have 0.53× and 0.1× higher saturation throughput, respectively, under nearest neighbor traffic compared to uniform random traffic.

Keywords FPGA · Performance · Network-on-chip · NOCs · Mesh topology · System-on-chip
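The six synthetic traffic patterns named in the abstract can be illustrated with a minimal sketch of how a source node might choose its destination. The definitions below are the common textbook ones for a network of 2^bits nodes and may differ from those used in P-NoC. All function names are hypothetical, and three choices are assumptions: nodes are numbered row-major in a k × k mesh, uniform traffic excludes the source itself, and nearest neighbor picks one adjacent node at random.

```python
import random


def bit_complement(src, bits):
    """Invert every address bit of the source."""
    return ~src & ((1 << bits) - 1)


def bit_shuffle(src, bits):
    """Rotate the source address left by one bit."""
    mask = (1 << bits) - 1
    return ((src << 1) | (src >> (bits - 1))) & mask


def transpose(src, bits):
    """Swap the upper and lower halves of the address (bits must be even).

    With row-major ids in a square mesh this maps (x, y) to (y, x).
    """
    half = bits // 2
    low_mask = (1 << half) - 1
    return ((src & low_mask) << half) | (src >> half)


def uniform_random(src, bits, rng):
    """Pick any node other than the source with equal probability."""
    dst = rng.randrange((1 << bits) - 1)
    return dst if dst < src else dst + 1


def random_permutation(bits, rng):
    """Return a fixed one-to-one mapping; entry i is the destination of node i.

    A node may be mapped to itself; a simulator could simply skip such pairs.
    """
    nodes = list(range(1 << bits))
    rng.shuffle(nodes)
    return nodes


def nearest_neighbor(src, k, rng):
    """Pick one of the adjacent nodes of src in a k x k mesh."""
    x, y = src % k, src // k
    candidates = [
        (x + dx, y + dy)
        for dx, dy in ((1, 0), (-1, 0), (0, 1), (0, -1))
        if 0 <= x + dx < k and 0 <= y + dy < k
    ]
    nx, ny = rng.choice(candidates)
    return ny * k + nx


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so runs are repeatable
    perm = random_permutation(4, rng)
    for src in range(16):  # 4 x 4 mesh, 4 address bits
        print(src, bit_complement(src, 4), bit_shuffle(src, 4),
              transpose(src, 4), uniform_random(src, 4, rng),
              perm[src], nearest_neighbor(src, 4, rng))
```

Nearest neighbor keeps every packet within one hop of its source, which is consistent with the higher saturation throughput the abstract reports for that pattern relative to uniform random traffic.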

Khyamling Parane and B. M. Prabhu Prasad contributed equally to this research work.

* Khyamling Parane [email protected]
B. M. Prabhu Prasad [email protected]
Basavaraj Talawar [email protected]

1 SPARK Lab, Department of CSE, National Institute of Technology, Surathkal, Karnataka, India


1 Introduction

Multiprocessor system-on-chips (MPSoCs) and chip multiprocessors (CMPs) continue to expand in size, complexity and number of processing elements (PEs) [1, 2]. CMPs and MPSoCs require a scalable and efficient communication fabric, as the traditional point-to-point bus-based interconnection experiences high latencies and low throughput [1–3]. NoCs are a modular, switching-based communication fabric that interconnects PEs through routers and links arranged in a bandwidth-demand-driven topology [4–6]. The choice of NoC in MPSoCs influences the latency, silicon area, speed and throughput. For these reasons, a parameterized model is needed for early estimation of the silicon area and performance of an NoC architecture that matches the target application [7]. NoC implementations distribute traffic to avoid congestion while operating under strict bandwidth, performance and energy constraints [8–1
