Common Warehouse Metamodel

  • PDF / 4,698,282 Bytes
  • 246 Pages / 547.087 x 737.008 pts Page_size
  • 36 Downloads / 196 Views

DOWNLOAD

REPORT


Cache Performance ▶ Performance Analysis of Transaction Processing Systems

Cache-Aware Query Processing ▶ Cache-Conscious Query Processing

Cache-Conscious Query Processing K ENNETH A. R OSS Columbia University, New York, NY, USA

Synonyms Cache-aware query processing; Cache-sensitive query processing

Definition Query processing algorithms are designed to efficiently exploit the available cache units in the memory hierarchy. Cache-conscious algorithms typically employ knowledge of architectural parameters such as cache size and latency. This knowledge can be used to ensure that the algorithms have good temporal and/or spatial locality on the target platform.

Historical Background Between 1980 and 2005, processing speeds improved by roughly four orders of magnitude, while memory speeds improved by less than a single order of magnitude. As a result, it is common (at the time of writing) #

2009 Springer ScienceþBusiness Media, LLC

for data accesses to RAM to require several hundred CPU cycles to resolve. Many database workloads have shifted from being I/O bound to being memory/CPUbound as the amount of memory per machine has been increasing. For such workloads, improving the locality of data-intensive operations can have a direct impact on the system’s overall performance.

Foundations A cache is a hardware unit that speeds up access to data. Several cache units may be present at various levels of the memory hierarchy, depending on the processor architecture. For example, a processor may have a small but fast Level-1 (L1) cache for data, and another L1 cache for instructions. The same processor may have a larger but slower L2 cache storing both data and instructions. Some processors may even have an L3 cache. On multicore processors, the lower level caches may be shared among groups of cores. Some initial analysis would typically be performed to determine the performance characteristics of a workload. For example, Ailamaki et al. [1] used hardware performance counters to demonstrate that several commercial systems were, at that time, suffering many L2 data cache misses and L1 instruction cache misses. Based on such observations, one can determine that the L2 data cache and L1 instruction cache are targets for performance tuning. If the operating system does not provide direct access to system parameters such as the cache size, a database system can run a calibration test to estimate the relevant parameters [13]. To get good cache performance, algorithm designers typically utilize one or more of the following general approaches:  Improve spatial locality, so that data items that are often accessed together are in the same cache lines.  Improve temporal locality for data, so that after an initial cache miss, subsequent data item accesses occur while the item is still cache-resident.

302

C

Cache-Conscious Query Processing

 Improve temporal locality for instructions, so that code that needs to be applied to many data items is executed many times while the instructions reside in the cache.  Hide the latency. La