Cache-Oblivious B-Tree
2005; Bender, Demaine, Farach-Colton

Rolf Fagerberg
Department of Mathematics and Computer Science, University of Southern Denmark, Odense, Denmark

Keywords and Synonyms

Cache-oblivious search tree; Cache-oblivious dictionary

Problem Definition

Computers contain a hierarchy of memory levels with vastly differing access times. Hence, the time for a memory access depends strongly on the innermost level containing the data accessed. In the analysis of algorithms, the standard RAM (or von Neumann) model cannot capture this effect, and external-memory models were introduced to model the situation better. The most widely used of these is the two-level I/O-model [1], also called the External Memory model or the Disk Access model. The I/O-model approximates the memory hierarchy by modeling two levels, with the inner level having size M, the outer level having infinite size, and transfers between the levels taking place in blocks of B consecutive elements. The cost of an algorithm is the number of memory transfers it makes. The cache-oblivious model, introduced by Frigo et al. [18], elegantly generalizes the I/O-model to a multilevel memory model by a simple measure: the algorithm is not allowed to know the values of B and M. More precisely, a cache-oblivious algorithm is an algorithm formulated in the RAM model but analyzed in the I/O-model, with the analysis valid for any values of B and M. Cache replacement is assumed to take place automatically by an optimal off-line cache-replacement strategy. Since the analysis holds for any B and M, it holds for all levels of the memory hierarchy simultaneously.

The subject here is efficient cache-oblivious data structures for the ordered dictionary problem, i.e., the problem of storing elements with keys from an ordered universe while supporting searches, insertions, deletions, and range searches. In full generality, searches are predecessor searches, returning the element with the largest key smaller than or equal to the key searched for.
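To make the cost measure concrete, the following back-of-envelope calculation contrasts transfer counts for a linear scan, for binary search in a plain sorted array (whose Θ(log(n/B)) bound is standard background, not stated in this entry), and for a structure achieving the O(log_B n) bound discussed below. The values n = 2^20 and B = 1024 are hypothetical example figures chosen only for illustration.

```python
# Illustrative memory-transfer counts in the I/O-model; n and B are
# hypothetical example values, and constant factors are ignored.
import math

n = 2 ** 20   # number of stored elements
B = 1024      # elements per block (a cache-oblivious algorithm may not use B)

scan         = math.ceil(n / B)                        # linear scan: ~n/B
array_search = math.ceil(math.log2(n) - math.log2(B))  # sorted array: ~log2(n/B)
tree_search  = math.ceil(math.log2(n) / math.log2(B))  # B-tree bound: ~log_B n

print(scan, array_search, tree_search)                 # 1024 10 2
```

The point of the cache-oblivious model is that the last bound should hold for every value of B simultaneously, even though B appears nowhere in the algorithm itself.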

Key Results

The first cache-oblivious dictionary was given by Prokop [21], who showed how to lay out a static binary tree in memory such that searches take O(log_B n) memory transfers. This layout, often called the van Emde Boas layout because it is reminiscent of the classic van Emde Boas data structure, also ensures that range searches take O(log_B n + k/B) memory transfers [2], where k is the size of the output. Both bounds are optimal for comparison-based searching.

The first dynamic cache-oblivious dictionary was given by Bender et al. [10]. Using a variant of the van Emde Boas layout, a density-maintenance algorithm of the type invented by Itai et al. [19], and weight-balanced B-trees [5], they arrived at the following results:

Theorem 1 ([10]) There is a cache-oblivious dictionary structure supporting searches in O(log_B n) memory transfers, and insertions and deletions in amortized O(log_B n) memory transfers.

Theorem 2 ([10]) There is a cache-oblivious dictionary structure supporting searches in O(log_B n) memory transfers
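As an illustration of Prokop's static result, the following sketch builds the van Emde Boas layout recursively, splitting a tree of height h into a top subtree of roughly half the height and its bottom subtrees, and answers predecessor searches by walking the laid-out tree. The class and function names are illustrative rather than taken from the cited papers, and the sketch assumes the number of keys is 2^h - 1 so that the tree is complete.

```python
# A minimal sketch of the static van Emde Boas layout; names are illustrative.
# Assumes len(sorted_keys) = 2^h - 1 for some h, i.e., a complete binary tree.

def veb_order(height):
    """Heap-numbered nodes 1..2^height - 1 listed in van Emde Boas order."""
    def rec(root, h):
        if h == 1:
            return [root]
        top_h = h // 2                      # split roughly at half the height
        bot_h = h - top_h
        order = rec(root, top_h)            # lay out the top subtree first,
        first_leaf = root << (top_h - 1)    # then each bottom subtree in turn
        for leaf in range(first_leaf, first_leaf + (1 << (top_h - 1))):
            order += rec(2 * leaf, bot_h)
            order += rec(2 * leaf + 1, bot_h)
        return order
    return rec(1, height)


class VebTree:
    """Static search tree over a sorted list of 2^h - 1 keys."""

    def __init__(self, sorted_keys):
        n = len(sorted_keys)
        height = n.bit_length()
        assert n == (1 << height) - 1, "this sketch requires a complete tree"
        # Assign keys to heap-numbered nodes by in-order rank.
        key_of = [None] * (n + 1)
        def fill(v, lo, hi):
            if lo > hi:
                return
            mid = (lo + hi) // 2
            key_of[v] = sorted_keys[mid]
            fill(2 * v, lo, mid - 1)
            fill(2 * v + 1, mid + 1, hi)
        fill(1, 0, n - 1)
        # Store the nodes in one array in van Emde Boas order; each slot holds
        # (key, slot of left child, slot of right child), with -1 for no child.
        order = veb_order(height)
        slot_of = [0] * (n + 1)
        for slot, v in enumerate(order):
            slot_of[v] = slot
        self.nodes = [
            (key_of[v],
             slot_of[2 * v] if 2 * v <= n else -1,
             slot_of[2 * v + 1] if 2 * v + 1 <= n else -1)
            for v in order
        ]
        self.root = slot_of[1]

    def search(self, key):
        """Predecessor search: largest stored key <= key, or None."""
        slot, best = self.root, None
        while slot != -1:
            k, left, right = self.nodes[slot]
            if k <= key:
                best, slot = k, right       # remember k, look for a larger key
            else:
                slot = left
        return best


if __name__ == "__main__":
    t = VebTree(list(range(0, 62, 2)))      # keys 0, 2, ..., 60
    print(t.search(17), t.search(2))        # 16 2
```

The standard analysis considers the level of the recursive subdivision at which subtree sizes first drop to at most B: such subtrees are stored contiguously in at most two blocks and have height at least (1/2) log B, so a root-to-leaf path of length about log n crosses O(log n / log B) = O(log_B n) of them, which gives the stated search bound for every B simultaneously.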