Hash Tables and Associative Arrays

If you want to get a book from the central library of the University of Karlsruhe, you have to order the book in advance. The library personnel fetch the book from the stacks and deliver it to a room with 100 shelves. You find your book on a shelf numbere

  • PDF / 301,813 Bytes
  • 18 Pages / 439.37 x 666.142 pts Page_size
  • 24 Downloads / 217 Views

DOWNLOAD

REPORT


If you want to get a book from the central library of the University of Karlsruhe, you have to order the book in advance. The library personnel fetch the book from the stacks and deliver it to a room with 100 shelves. You find your book on a shelf numbered with the last two digits of your library card. Why the last digits and not the leading digits? Probably because this distributes the books more evenly among the shelves. The library cards are numbered consecutively as students sign up, and the University of Karlsruhe was founded in 1825. Therefore, the students enrolled at the same time are likely to have the same leading digits in their card number, and only a few shelves would be in use if the leading digits were used. The subject of this chapter is the robust and efficient implementation of the above “delivery shelf data structure”. In computer science, this data structure is known as a hash1 table. Hash tables are one implementation of associative arrays, or dictionaries. The other implementation is the tree data structures which we shall study in Chap. 7. An associative array is an array with a potentially infinite or at least very large index set, out of which only a small number of indices are actually in use. For example, the potential indices may be all strings, and the indices in use may be all identifiers used in a particular C++ program. Or the potential indices may be all ways of placing chess pieces on a chess board, and the indices in use may be the placements required in the analysis of a particular game. Associative arrays are versatile data structures. Compilers use them for their symbol table, which associates identifiers with information about them. Combinatorial search programs often use them for detecting whether a situation has already been looked at. For example, chess programs have to deal with the fact that board positions can be reached by different sequences of moves. However, each position needs to be evaluated only once. The solution is to store positions in an associative array. One of the most widely used implementations of the join operation in relational databases temporarily stores one of the participating relations in an associative array. Scripting languages such as AWK 1

Photograph of the mincer above by Kku, Rainer Zenz (Wikipedia), Licence CC-by-SA 2.5.

82

4 Hash Tables and Associative Arrays

[7] and Perl [203] use associative arrays as their main data structure. In all of the examples above, the associative array is usually implemented as a hash table. The exercises in this section ask you to develop some further uses of associative arrays. Formally, an associative array S stores a set of elements. Each element e has an associated key key(e) ∈ Key. We assume keys to be unique, i.e., distinct elements have distinct keys. Associative arrays support the following operations: • S.insert(e : Element): S := S ∪ {e}. • S.remove(k : Key): S := S \ {e}, where e is the unique element with key(e) = k. • S.find(k : Key): If there is an e ∈ S with key(e) = k, return e; otherwise,