Using Feature-Based Description Logics to avoid Duplicate Elimination in Object-Relational Query Languages

  • PDF / 839,266 Bytes
  • 9 Pages / 595.276 x 790.866 pts Page_size
  • 19 Downloads / 204 Views

DOWNLOAD

REPORT


TECHNICAL CONTRIBUTION

Using Feature-Based Description Logics to avoid Duplicate Elimination in Object-Relational Query Languages David Toman1

· Grant Weddell1

Received: 31 October 2019 / Accepted: 28 May 2020 © Gesellschaft für Informatik e.V. and Springer-Verlag GmbH Germany, part of Springer Nature 2020

Abstract A sound inference procedure is presented for removing operations that eliminate duplicates in queries formulated in a bag-algebra. The procedure is shown complete for positive queries over finite databases, and operates by appeal to logical consequence problems for feature-based description logics in which a TBox embeds an object-relational schema. For unions of conjunctive queries in which an embedded schema excludes cover constraints, the procedure runs in PTIME, and in EXPTIME otherwise.

1 Introduction Query plans for queries in SQL are usefully abstracted as expressions in a bag-algebra, and are usually required to compute answers that are sets of tuples. Consequently, such plans may ultimately require operations that perform tuple duplicate elimination (TDE) in query answers. Such operations often constitute a considerable performance penalty, e.g., by inducing a temporary store and sorting of intermediate results that would otherwise be avoided by pipelining which often suffices for the remaining operations. In this paper, we show how dialects of the FunDL family of feature-based description logics (DLs) [9] can be usefully employed in rewrite rules that reduce the scope or eliminate the occurrence of TDE operations for a bag-algebra over object-relational data sources, thus controlling and possibly eliminating the need for unbounded store and for sorting in query evaluation. We focus on two FunDL dialects with respective EXPTIME and PTIME complexity for their logical consequence problems: partial-DLFDE and partial-CFDnc . In common with all FunDL dialects, these logics were designed for capturing and integrating data sources that include a variety of common equality and tuple generating dependencies. For example, either logic is capable of embedding in a

B

so-called TBox the object-relational schema illustrated in Fig. 1, presumably the metadata for a hypothetical university data source. Here: (1) single arrows are attributes, realized as features in partial-DLFDE and partial-CFDnc (denoting partial unary functions), and (2) underlined attributes and double arrows indicate primary keys and inheritance respectively, and are embedded with so-called path functional dependencies (PFDs), a concept constructor common to all member FunDL dialects that generalizes the notions of primary keys, uniqueness constraints and functional dependencies ubiquitous in object-relational schemata. Figure 1 itself can be viewed as a purely relational schema with SQL tables. For example, consider the following SQL data definition for STUDENT and TAKES:

David Toman [email protected] Grant Weddell [email protected]

1

Cheriton School of Computer Science, University of Waterloo, 200 University Ave. W., Waterloo