An STT-MRAM based reconfigurable computing-in-memory architecture for general purpose computing
REGULAR PAPER
Yu Pan1 · Xiaotao Jia1,2 · Zhen Cheng1,2 · Peng Ouyang1 · Xueyan Wang1 · Jianlei Yang2,3 · Weisheng Zhao1,2

Received: 21 February 2020 / Accepted: 27 May 2020
© China Computer Federation (CCF) 2020
Abstract
Recently, many computing-in-memory architectures have been proposed to address the von Neumann bottleneck. However, most of the proposed architectures can only perform a few application-specific logic functions, whereas a scheme that supports general purpose computing is more meaningful for the complete realization of in-memory computing. This paper proposes a reconfigurable computing-in-memory architecture for general purpose computing based on STT-MRAM (GCIM). The proposed GCIM significantly reduces the energy consumed by data movement and effectively processes both fixed-point and floating-point calculations in parallel. In our design, the STT-MRAM array is divided into four subarrays in order to achieve reconfigurability: with a dedicated array connector, the four subarrays can either work independently at the same time or work together as a whole array. The proposed architecture is evaluated using Cadence Virtuoso. Simulation results show that the proposed architecture consumes less energy when performing fixed-point or floating-point operations.

Keywords Computing-in-memory · Reconfigurable · STT-MRAM · General purpose computing
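As an illustrative aside, the reconfiguration scheme summarized in the abstract can be sketched in a few lines of Python. The names `SubArray`, `ArrayConnector`, and the mode strings are hypothetical placeholders, not identifiers from the paper; the sketch only captures the two operating modes (four independent subarrays versus one fused array).

```python
# Hypothetical sketch of the GCIM reconfiguration idea: four STT-MRAM
# subarrays that either run independent operations in parallel, or are
# joined by an array connector into one logical wide array.
class SubArray:
    def __init__(self, idx):
        self.idx = idx

    def compute(self, op):
        # Stand-in for an in-memory logic operation on this subarray.
        return f"subarray{self.idx}:{op}"


class ArrayConnector:
    def __init__(self):
        self.subarrays = [SubArray(i) for i in range(4)]

    def run(self, mode, ops):
        if mode == "independent":
            # Each subarray executes its own operation concurrently.
            return [s.compute(op) for s, op in zip(self.subarrays, ops)]
        elif mode == "fused":
            # All four subarrays cooperate on a single wide operation.
            return ["+".join(s.compute(ops[0]) for s in self.subarrays)]
        raise ValueError(f"unknown mode: {mode}")


gcim = ArrayConnector()
parallel = gcim.run("independent", ["add", "mul", "add", "mul"])
fused = gcim.run("fused", ["wide_add"])
```

The design choice the sketch mirrors is that reconfigurability lives in the connector, not in the subarrays themselves: each subarray exposes the same compute interface regardless of mode.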
1 Introduction

Over the past decades, as data scales have grown exponentially with time, the computational demands of data analytics applications have become ever more forbidding. In the conventional von Neumann architecture, however, the overhead of data communication between the processor and the memory units causes huge performance degradation and energy consumption, a problem known as the von Neumann bottleneck (Kim et al. 2003; Wulf and McKee 1995). For

* Corresponding authors:
Xiaotao Jia [email protected]
Weisheng Zhao [email protected]

1 Beijing Advanced Innovation Center for Big Data and Brain Computing, School of Microelectronics, Fert Beijing Research Institute, Beihang University, Beijing 100191, China
2 Beihang-Goertek Joint Microelectronics Institute, Qingdao Research Institute, Beihang University, Qingdao 266101, China
3 Beijing Advanced Innovation Center for Big Data and Brain Computing, School of Computer Science and Engineering, Fert Beijing Institute, Beihang University, Beijing 100191, China
example, when comparing the cost of computation (a double-precision fused multiply-add) with communication (a 64-bit read from an off-chip SRAM), the ratio of communication energy to computation energy is 50× at 40 nm (Keckler et al. 2011). More seriously, a DRAM access consumes 200 times more energy than a floating-point operation (Kang et al. 2020). Such off-chip accesses become increasingly necessary as data scales grow larger, and even the cleverest latency-hiding techniques cannot conceal their overhead. In order to overcome
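The energy ratios above can be turned into a back-of-the-envelope model of why data movement dominates. In the sketch below, the absolute picojoule figure for one fused multiply-add is a hypothetical placeholder; only the 50× and 200× ratios come from the cited comparisons.

```python
# Back-of-the-envelope model of the communication/computation energy gap.
# FMA_ENERGY_PJ is an assumed placeholder; the 50x and 200x multipliers
# are the ratios quoted in the text.
FMA_ENERGY_PJ = 50.0                    # assumed energy of one double-precision FMA
OFFCHIP_READ_PJ = 50 * FMA_ENERGY_PJ    # 64-bit off-chip read, 50x at 40 nm
DRAM_ACCESS_PJ = 200 * FMA_ENERGY_PJ    # DRAM access, 200x a floating-point op


def program_energy_pj(n_ops, reads_per_op):
    """Total energy when each operation also triggers off-chip reads."""
    return n_ops * (FMA_ENERGY_PJ + reads_per_op * OFFCHIP_READ_PJ)


# Even a single off-chip read per operation makes movement dominate:
total = program_energy_pj(1_000, reads_per_op=1)
compute_share = (1_000 * FMA_ENERGY_PJ) / total   # fraction spent computing
```

With one off-chip read per operation, computation accounts for only 1/51 of the total energy, which is the quantitative motivation for moving computation into the memory itself.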