Orthogonal Range Searching

At first sight it seems that databases have little to do with geometry. Nevertheless, many types of questions—from now on called queries—about data in a database can be interpreted geometrically. To this end we transform records in a database into points

  • PDF / 248,463 Bytes
  • 26 Pages / 547.087 x 685.984 pts Page_size
  • 100 Downloads / 203 Views

DOWNLOAD

REPORT


Orthogonal Range Searching Querying a Database

At first sight it seems that databases have little to do with geometry. Nevertheless, many types of questions—from now on called queries—about data in a database can be interpreted geometrically. To this end we transform records in a database into points in a multi-dimensional space, and we transform the queries about the records into queries on this set of points. Let’s demonstrate this with an example. salary G. Ometer born: Aug 19, 1954 salary: $3,500

4,000

3,000

19,500,000

19,559,999

date of birth

Consider a database for personnel administration. In such a database the name, address, date of birth, salary, and so on, of each employee are stored. A typical query one may want to perform is to report all employees born between 1950 and 1955 who earn between $3,000 and $4,000 a month. To formulate this as a geometric problem we represent each employee by a point in the plane. The first coordinate of the point is the date of birth, represented by the integer 10, 000 × year + 100 × month + day, and the second coordinate is the monthly salary. With the point we also store the other information we have about the employee, such as name and address. The database query asking for all employees born between 1950 and 1955 who earn between $3,000 and

Figure 5.1 Interpreting a database query geometrically

95

Chapter 5 ORTHOGONAL RANGE SEARCHING

4,000

3,000 4 2

19,500,000

19,559,999

$4,000 transforms into the following geometric query: report all points whose first coordinate lies between 19,500,000 and 19,559,999, and whose second coordinate lies between 3,000 and 4,000. In other words, we want to report all the points inside an axis-parallel query rectangle—see Figure 5.1. What if we also have information about the number of children of each employee, and we would like to be able to ask queries like “report all employees born between 1950 and 1955 who earn between $3,000 and $4,000 a month and have between two and four children”? In this case we represent each employee by a point in 3-dimensional space: the first coordinate represents the date of birth, the second coordinate the salary, and the third coordinate the number of children. To answer the query we now have to report all points inside the axisparallel box [19, 500, 000 : 19, 559, 999]×[3, 000 : 4, 000]×[2 : 4]. In general, if we are interested in answering queries on d fields of the records in our database, we transform the records to points in d-dimensional space. A query asking to report all records whose fields lie between specified values then transforms to a query asking for all points inside a d-dimensional axis-parallel box. Such a query is called a rectangular range query, or an orthogonal range query, in computational geometry. In this chapter we shall study data structures for such queries.

5.1

96

1-Dimensional Range Searching

Before we try to tackle the 2- or higher-dimensional rectangular range searching problem, let’s have a look at the 1-dimensional version. The data we are given is a set of p