Radix-10 Restoring Square Root for 6-input LUTs Programmable Devices


Martín Vázquez¹ · Marcelo Tosini¹ · Lucas Leiva¹

Received: 2 March 2020 / Revised: 3 October 2020 / Accepted: 10 October 2020

© Springer Science+Business Media, LLC, part of Springer Nature 2020

Abstract This paper proposes efficient fixed-point and floating-point implementations of radix-10 square root for Xilinx FPGA devices. The method implements a digit-recurrence algorithm with restoring and supports the three decimal floating-point (DFP) formats specified in the IEEE 754-2008 standard. The technique used for restoring is optimal and novel. The designs use new techniques based on the efficient utilization of dedicated resources in the programmable devices. Implementations were made in Xilinx 7-series devices. The fixed-point square root designs operate at up to 212 MHz for p=7, 197 MHz for p=16, and 190 MHz for p=34. The DFP square root designs reach 194 MHz for p=7, 183 MHz for p=16, and 174 MHz for p=34. The proposed architecture achieves better computation times than related works.

Keywords Square root · Digit-recurrence algorithm · Decimal arithmetic · Floating-point representation · FPGA
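As an illustration of the general idea behind a radix-10 restoring digit recurrence, the following Python sketch computes an integer square root one decimal digit per iteration. It is a software model of the textbook schoolbook method, not the hardware architecture proposed in this paper; the function name `restoring_sqrt_radix10` and the parameter `extra_digits` are hypothetical names chosen for the example.

```python
def restoring_sqrt_radix10(radicand: int, extra_digits: int = 0) -> tuple[int, int]:
    """Return (root, remainder) for floor(sqrt(radicand * 100**extra_digits)).

    Textbook radix-10 restoring recurrence (illustrative sketch only):
    each iteration consumes two radicand digits and yields one root digit.
    """
    if radicand < 0:
        raise ValueError("radicand must be non-negative")

    digits = str(radicand)
    if len(digits) % 2:              # pad to an even number of digits
        digits = "0" + digits
    digits += "00" * extra_digits    # each extra pair gives one more root digit

    root = 0
    remainder = 0
    for i in range(0, len(digits), 2):
        # Bring down the next pair of decimal digits.
        remainder = remainder * 100 + int(digits[i:i + 2])
        digit = 0
        while True:
            # (20*root + d) * d is the sum of the terms 20*root + 2k - 1
            # for k = 1..d, so each trial subtracts the next term.
            trial = remainder - (20 * root + 2 * digit + 1)
            if trial < 0:
                break                # restore: keep the previous remainder
            remainder = trial
            digit += 1
        root = root * 10 + digit     # append the selected digit (0..9)
    return root, remainder


if __name__ == "__main__":
    # sqrt(2) to four fractional digits: root 14142 represents 1.4142
    print(restoring_sqrt_radix10(2, extra_digits=4))
```

The invariant `root * root + remainder == scaled radicand` holds after every iteration, which is what makes the discarded trial subtraction a "restore" step. A hardware design would typically select each digit with parallel comparisons rather than up to ten sequential subtractions; that aspect is outside the scope of this sketch.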

1 Introduction

The demand for human-oriented applications has increased significantly in recent years. Examples include e-commerce, financial analysis, banking transactions, accounting, and Internet-of-Things (IoT) applications, among others [11,38]. The McKinsey Global Institute reports that IoT businesses will deliver US$ 6.2 trillion in revenue by 2025 [27]. These applications make intensive use of decimal numbers (integer and floating point) and frequently require results that match those obtained “manually.”

¹ Computer and Systems Department, UNICEN, Tandil, Argentina

Circuits, Systems, and Signal Processing

Furthermore, the errors introduced by binary arithmetic may violate legal or precision requirements [7]. Owing to the growing interest in decimal computing, the IEEE 754-2008 standard includes specifications for decimal floating-point (DFP) arithmetic [22]. DFP arithmetic has been implemented in both hardware [12,13,45] and software [3,6,8,9]. Software implementations of DFP operations are between one and two orders of magnitude slower than hardware solutions [2,44]. This gap has led general-purpose microprocessor manufacturers to provide dedicated DFP hardware in their products, such as the IBM System z9 [13], IBM POWER6, and z10 processors [45]. However, few third-party IP cores for decimal hardware computation have been designed to take advantage of Field Programmable Gate Array (FPGA) technology. Programmable logic is one of the main technological alternatives for hardware acceleration; thus, decimal cores can be used in the context of high-performance computing (HPC). In addition, the investigations a