Efficient Compilation of Regular Path Queries

  • PDF / 1,960,425 Bytes
  • 17 Pages / 612.419 x 808.052 pts Page_size
  • 22 Downloads / 246 Views

DOWNLOAD

REPORT


FACHBEITRAG

Efficient Compilation of Regular Path Queries Frank Tetzel1 · Wolfgang Lehner1 · Romans Kasperovics2 Received: 13 December 2019 / Accepted: 12 August 2020 © Gesellschaft für Informatik e.V. and Springer-Verlag GmbH Germany, part of Springer Nature 2020

Abstract Ad hoc code generation is a state-of-the-art processing paradigm for database execution engines. It minimizes resource consumption by generating specialized code, tailored and streamlined for the single query at hand. In this work, we apply ad hoc code generation to regular path queries (RPQs), an advanced query type in declarative graph query languages. We investigate code generation from multiple angles. We propose COAT, an embedded domain specific language (EDSL) in C++ to improve accessibility of code generation by simplifying the interaction with compiler APIs. Furthermore, we analyze and compare two back ends for COAT providing the just-in-time (JIT) compilation functionality: LLVM, a compiler framework popularly used in databases for code generation, and AsmJit, a JIT assembler with very low compilation latency. We evaluate various compilation techniques for RPQs on different synthetic graph workloads.

Keywords Regular path queries · Code generation · Embedded domain specific language

1 Introduction In recent years, it is becoming more widespread among relational databases to use just-in-time (JIT) compilation to generate machine code for optimized query plans and execute them natively on the hardware. More and more commercial vendors [7, 26] and research prototypes [5, 12, 13, 15, 19, 20] employ code generation instead of interpretation to evaluate a query plan, or adaptively switch between interpretation and code generation in a multi-stage approach [10]. JIT compilation provides benefits for other data models and processing systems too, such as graph processing. Up until now, only a few graph databases [1, 21] and domain specific languages (DSLs) for graph algorithms [8, 14] make use of ad hoc code generation. In this paper, we will

 Frank Tetzel

[email protected] Wolfgang Lehner [email protected] Romans Kasperovics [email protected] 1

Database Systems Group, TU Dresden, Dresden, Germany

2

SAP SE, Walldorf, Germany

focus on applying JIT compilation techniques to declarative graph processing, particularly regular path queries (RPQs). The advent of main-memory databases shifted the focus of the processing system from efficient disk access to efficient CPU utilization. Ad hoc code generation enables databases to minimize the code which evaluates a query plan to the essential instructions required for the specific query at hand and therefore leads to a higher CPU utilization. There are two points in time when code specialization can be employed: ahead-of-time (AOT) and just-in-time (JIT). Code specialization during AOT compilation is part of the compilation process of the database system. Programming languages like C++ provide language features to ease the generation of specialized code from a generic