Hardware BLAST Algorithms with Multi-seeds Detection and Parallel Extension

As one of the most widely used bio-sequence searching tools, BLAST adopts index-based approach to detect the matches between two substrings by looking up a large table and processing one match per query. In this paper, we propose a systolic array approach

PDF / 545,057 Bytes
12 Pages / 430 x 660 pts Page_size
94 Downloads / 218 Views

DOWNLOAD

REPORT

Abstract. As one of the most widely used bio-sequence searching tools, BLAST adopts index-based approach to detect the matches between two substrings by looking up a large table and processing one match per query. In this paper, we propose a systolic array approach to detect string matches without using looking up tables. The pipelining systolic array is implemented as a multi-seeds detection and parallel extension pipeline engine to accelerate the ﬁrst two stages of NCBI BLAST algorithm. Diﬀerent from the index-based approach, our implementation consumes little memory resources and eliminates redundant string extensions by merging multiple adjoin seeds into a valid seed. Our FPGA implementation achieves superior performance results in both of processing element number and clock frequency over related works in the area of FPGA BLAST accelerators. The experimental results also show the speedup can reach about 17 and 48 compared to the NCBI BLASTp and TBLASTn programs for 3072-residue queries on Intel P4 CPU, respectively. Furthermore, the idea of multi-seeds detection also can be adopted in other seed-based heuristic searching applications.

1

Introduction

The comparison of DNA or protein sequences has become a fundamental task of modern molecular biology. BLAST (Basic Local Alignment Search Tool) [1] as one of the most important tools has been designed to run on commodity PC clusters at present to search for sequence similarity in genomic databases. With the exponential growth of the bio-sequence databases, such as the NCBI (National Center for Biotechnology Information) GenBank [2], which has doubled in size every 12∼16 months for the last decade and now stands at over 56 billion characters, the computational requirements for sequence comparisons have far exceeded the computing capability. General-purposed microprocessors typically provide very limited bit-level parallelism. However, sequence comparison algorithms exhibit a much higher degree of bit-level data parallelism, typically hundreds of bit-level operations can be performed in parallel. Therefore, many researchers keen on implementing BLAST algorithms in hardware to avoid the low eﬃciency in general-purposed microprocessors. Recently, FPGA chips have emerged as one promising application accelerator, using a combination of FPGAs and general-purposed CPUs to R. Woods et al. (Eds.): ARC 2008, LNCS 4943, pp. 39–50, 2008. c Springer-Verlag Berlin Heidelberg 2008

40

F. Xia, Y. Dou, and J. Xu

accelerate BLAST algorithm attracts much more attention. A number of parallel architectures have been proposed, such as Mercury BLASTn [3], [4], [5], Tree-BLAST [6], Mercury BLASTp [7], RC-BLAST [8], FPGA/FLASH Accelerator [9],Multi-engine BLASTn Accelerator [10], [11] and many commercialized system, BEE2 [12], CLC Cube [13], Mitrion [14] and DeCypher [15] et.al. have been built. Most of the current implementations adopt the index-based searching approach, which builds all kinds of tables to record the position of each word in query sequence, then drives the words (o

Data Loading...

Hardware BLAST Algorithms with Multi-seeds Detection and Parallel Extension

Recommend Documents

Parallel Architectures and Bioinspired Algorithms

Computer Arithmetic Algorithms and Hardware Implementations

Parallel Join Algorithms

Parallel Query Execution Algorithms

Extension of geometrical shock dynamics for blast wave propagation

Parallel Algorithms in Computational Science and Engineering

Introduction to Parallel Processing Algorithms and Architectures

Hardware Acceleration of EDA Algorithms Custom ICs, FPGAs and GPUs

Anomaly Detection Principles and Algorithms

Hardware Trojan Detection Using Machine Learning Technique

Active Fire Detection Algorithms

Numerical Algorithms for Modern Parallel Computer Architectures