Systolic architecture for adaptive block FIR filter for throughput using distributed arithmetic
Ch Pratyusha Chowdari1 · J. B. Seventline2
Received: 27 April 2020 / Accepted: 10 August 2020
© Springer Science+Business Media, LLC, part of Springer Nature 2020
Abstract In this paper, the design of a distributed arithmetic (DA) finite impulse response (FIR) filter using the block least mean square (BLMS) algorithm, based on a systolic array architecture with parallel data processing units, is presented. The proposed work exhibits a high level of parallelism, which improves the efficiency of the variable-coefficient FIR structure. Parallel look-up tables (LUTs) in combination with a shift accumulator emulate the multiply-and-accumulate operations. To reduce the number of clock cycles, B parallel LUTs are used, where B is the coefficient size. Block processing in the BLMS adaptive FIR filter with block length L gives L times higher throughput. The structure accepts a block of input samples and generates a block of output samples in each clock cycle. It requires fewer registers for computing the filter output and the weight increment vector, because the registers are implemented using a memory reuse concept. The structure does not require a multiplexer. The designed DA-based FIR filter is implemented on an FPGA. The implementation results show that the proposed architecture achieves high speed and low power. The proposed structure consumes 17.3 times less power and achieves 9.5 times higher throughput than existing designs.
Keywords Block least mean square (BLMS) · Distributed arithmetic (DA) · Adaptive block finite impulse response (ADBFIR) filter · Systolic architecture
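As a rough illustration of how a LUT combined with a shift accumulator can emulate multiply-and-accumulate, the following is a minimal software sketch of textbook distributed arithmetic. It is not the architecture proposed in the paper. The function names `build_lut` and `da_inner_product`, the 8-bit two's-complement sample width, and the choice to address the LUT with sample bits are all assumptions made for this example. The loop over bit positions is serial here; in hardware, the bit slices could instead be looked up in parallel LUTs.

```python
def build_lut(coeffs):
    """Precompute the sum of every subset of coefficients (2^N entries).

    Bit k of the address selects whether coeffs[k] is included in the sum.
    """
    n = len(coeffs)
    return [
        sum(c for k, c in enumerate(coeffs) if (addr >> k) & 1)
        for addr in range(1 << n)
    ]


def da_inner_product(coeffs, samples, bits=8):
    """Compute sum(coeffs[k] * samples[k]) without any multiplication.

    Assumes integer coefficients and samples that fit in `bits`-bit
    two's complement (an assumption of this sketch).
    """
    lut = build_lut(coeffs)
    mask = (1 << bits) - 1
    acc = 0
    for b in range(bits):
        # Form the LUT address from bit b of every sample.
        addr = 0
        for k, x in enumerate(samples):
            addr |= (((x & mask) >> b) & 1) << k
        term = lut[addr] << b  # shift replaces multiplication by 2^b
        # The most significant bit carries negative weight in two's complement.
        acc += -term if b == bits - 1 else term
    return acc
```

For example, `da_inner_product([3, -5, 7, 2], [10, -4, 25, -128])` gives the same value as the direct sum of products, using only table look-ups, shifts, and additions.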
* Ch Pratyusha Chowdari [email protected]
J. B. Seventline [email protected]
1 GRIET, Hyderabad 500090, India
2 GITAM University, Visakhapatnam 530045, India

1 Introduction
Adaptive filters are used in channel estimation, echo and noise cancellation, system identification, etc. Finite impulse response (FIR) digital filters are used in various digital signal processing (DSP) applications. Many of these applications require real-time processing with high performance and low power. Multipliers are the basic elements of FIR filters, but filters with a large filter length require a large number of multipliers, which consume power and hardware area, because a multiplier is more complex than an adder. Many multiplier-less alternatives are listed in the literature. They are of two types: shift-add based and memory based. Shift-add-based designs are either common sub-expression elimination (CSE) based or multiple constant multiplication (MCM) based. We focus on memory-based designs, which use LUTs and adders. Memory-based designs are further classified into two types: LUT-multiplier based and distributed arithmetic (DA) based. DA uses bit-serial operations. Several techniques (Mohanty et al. 2015; Allred et al. 2005; Mohanty and Meher 2012; Park and Meher 2013; Mehe