Improving power-performance via hybrid cache for chip many cores based on neural network prediction technique
TECHNICAL PAPER
Furat Al-Obaidy1 • Arghavan Asad1 • Farah A. Mohammadi1
Received: 18 September 2020 / Accepted: 24 September 2020. © Springer-Verlag GmbH Germany, part of Springer Nature 2020
Abstract The increasing need to run significant data-analytics applications, together with the growing demand for big data computing systems, has created a need for efficient platforms that combine high performance with realizable power consumption, such as chip multiprocessors (CMPs). At the same time, shrinking feature sizes and the drive to pack ever more transistors into a single chip have led to serious design challenges, with chips consuming significant power at high area densities. We present a reconfigurable hybrid cache system for the last-level cache (LLC) that integrates an emerging memory technology, STT-RAM, with SRAM. The approach consists of two phases: off-time and on-time. In the off-time phase, a neural network (NN) is trained; in the on-time phase, the reconfigurable cache uses the NN to predict the latency demanded by the running application. Experimental results on a three-dimensional chip with 64 cores show that, under the PARSEC benchmarks, the proposed design achieves a 25% performance speedup and reduces energy consumption by 78.4% compared to a non-reconfigurable pure-SRAM cache architecture.
Corresponding author: Furat Al-Obaidy, [email protected]
Arghavan Asad, [email protected]
Farah A. Mohammadi, [email protected]
1 Department of Electrical and Computer Engineering, Ryerson University, 350 Victoria Street, Toronto, ON M5B 2K3, Canada

1 Introduction

Recently, ML-based energy optimization has gained significant attention, owing to its ability to learn the power and energy behavior of CMP applications: by predicting the optimal task placement, it enables informed decisions across multiple workloads that improve both energy efficiency and performance. CMP architectures are the main platform for executing these workloads and can be orchestrated to deliver efficient performance, low power consumption, and lower cost. In addition, un-core components such as the cache memory contain a large number of transistors and consume a significant amount of power. Multicores, particularly CMPs, therefore allow the LLC to be reconfigured based on the target applications (Pagani et al. 2020). Moreover, with the increasing parallelism of emerging learning-based applications on CMPs, there is a growing need for more cores. With rising core counts in CMPs, power consumption becomes another important challenge. Typically, two main factors affect the power consumption of CMOS chips. One is static power consumption, which is attributed to the leakage current
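The off-time/on-time flow described above can be sketched in miniature. The snippet below is an illustrative sketch, not the paper's implementation: it fits a simple one-feature linear model (standing in for the NN) during an off-time training phase, then uses the prediction at run time to partition hypothetical SRAM and STT-RAM ways in a hybrid LLC. The feature (profiled miss rate), the latency budget, and the way-allocation rule are all assumptions made for illustration.

```python
def train_offline(samples):
    """Off-time phase: fit a one-feature least-squares linear model
    mapping a profiled miss rate to an observed LLC latency demand.
    (Stands in for the NN training described in the paper.)"""
    n = len(samples)
    mean_x = sum(x for x, _ in samples) / n
    mean_y = sum(y for _, y in samples) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in samples)
    var = sum((x - mean_x) ** 2 for x, _ in samples)
    slope = cov / var
    return slope, mean_y - slope * mean_x

def reconfigure(model, miss_rate, total_ways=16, latency_budget=5.0):
    """On-time phase: predict the running application's latency demand
    and allocate more fast SRAM ways (vs. dense, low-leakage STT-RAM
    ways) when the prediction exceeds the budget. The 1/2 vs. 3/4
    split is a hypothetical policy, chosen only for illustration."""
    slope, intercept = model
    predicted = slope * miss_rate + intercept
    if predicted <= latency_budget:
        sram_ways = total_ways // 2
    else:
        sram_ways = (3 * total_ways) // 4
    return predicted, sram_ways, total_ways - sram_ways

# Hypothetical profiling data: (miss rate, observed latency demand).
model = train_offline([(0.01, 2.0), (0.05, 4.0), (0.10, 6.5), (0.20, 11.0)])
predicted, sram_ways, sttram_ways = reconfigure(model, miss_rate=0.15)
```

In this toy policy a latency-hungry phase receives a larger share of low-latency SRAM ways, while latency-tolerant phases keep more capacity in STT-RAM, which trades longer access latency for far lower leakage power.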