WODII: a solution to process SPARQL queries over distributed data sources


Ahmed Rabhi1 • Rachida Fissoune1

Received: 2 March 2019 / Revised: 10 July 2019 / Accepted: 19 October 2019
© Springer Science+Business Media, LLC, part of Springer Nature 2019

Abstract The web of data can be seen as a distributed environment hosting structured and linked data based on Semantic Web standards. This is a promising feature for Semantic Web developers, who benefit from being able to remotely access different RDF repositories available on the web, collect fragments of information from several sources, and combine the resulting parts into an integrated answer. In this paper, we propose an index-based solution, Web of Data Information Integrator (WoDII), to process SPARQL queries over independent data sources without prior knowledge of the sources contributing to the answer. By relying on an index, the system avoids non-relevant sources and maps each selected source to a cluster of sub-queries. As a result, network traffic decreases, making the process less dependent on the quality of the connection.

Keywords SPARQL · Web of data · Aggregated search · Ontology-based data access
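The index-based idea described in the abstract can be illustrated with a minimal sketch. It is not WoDII's implementation: the index layout (predicate mapped to endpoints), the endpoint URLs, and the helper names `cluster_by_source` and `to_subquery` are all assumptions made for illustration, and no network call is performed.

```python
from collections import defaultdict

# Hypothetical index: predicate -> endpoints assumed to hold triples using it.
# The real WoDII index structure is not described in this excerpt.
INDEX = {
    "foaf:name": {"http://example.org/people/sparql"},
    "dbo:birthPlace": {
        "http://example.org/people/sparql",
        "http://example.org/geo/sparql",
    },
    "geo:lat": {"http://example.org/geo/sparql"},
}


def cluster_by_source(patterns, index):
    """Group triple patterns by the endpoints the index lists for their predicate.

    Endpoints that match no pattern never appear in the result, so
    non-relevant sources are skipped without being contacted.
    """
    clusters = defaultdict(list)
    for pattern in patterns:
        _, predicate, _ = pattern
        for endpoint in sorted(index.get(predicate, ())):
            clusters[endpoint].append(pattern)
    return dict(clusters)


def to_subquery(patterns):
    """Render one cluster as a single SELECT, so each source gets one request.

    PREFIX declarations are omitted for brevity.
    """
    body = " .\n  ".join(" ".join(p) for p in patterns)
    return "SELECT * WHERE {\n  " + body + " .\n}"


# A user query expressed as (subject, predicate, object) triple patterns.
QUERY = [
    ("?p", "foaf:name", "?name"),
    ("?p", "dbo:birthPlace", "?city"),
    ("?city", "geo:lat", "?lat"),
]

CLUSTERS = cluster_by_source(QUERY, INDEX)
SUBQUERIES = {endpoint: to_subquery(ps) for endpoint, ps in CLUSTERS.items()}

for endpoint, subquery in SUBQUERIES.items():
    print(endpoint)
    print(subquery)
```

Sending one clustered sub-query per selected source, rather than one request per triple pattern, is the kind of grouping that would reduce network traffic. The partial results would still have to be joined on shared variables, a step this sketch leaves out.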

Ahmed Rabhi [email protected] · Rachida Fissoune [email protected]
1 ENSA of Tangier, Abdelmalek Essaadi University, Tangier, Morocco

1 Introduction

The Web is evolving from a “Web of linked documents” into a “Web of linked data”, providing better opportunities for sharing and searching information. The web of data can be seen as a giant collection of graphs containing structured data in a machine-readable format based on Semantic Web design principles and standards, thus providing semantic developers with an effective tool to remotely access data on the web. The Linking Open Data (LOD) cloud forms a large graph consisting of billions of RDF triples distributed over various data sets available on the web. These data sets are accessed via SPARQL endpoints that allow SPARQL queries to be executed. A sought piece of information may not exist entirely in a single RDF repository, and its parts may have to be retrieved from several sources. Moreover, SPARQL endpoints are developed and managed independently and have varying performance (execution time and availability). Consequently, the user has to execute multiple SPARQL queries over these endpoints to aggregate fragments of data instead of writing a single query integrating all the parts of the information, a task that grows harder as the query becomes more complex. Therefore, it is necessary to set up an aggregated search engine able to distribute the processing of SPARQL queries over several endpoints and integrate the retrieved data fragments into a unified answer.

Several studies have been carried out with the aim of executing SPARQL queries on distributed data sources in the Web of Data and joining the results in a single final answer. Considering the distribution of data and its sources’ indep
