Grid and Cloud Database Management

Since the 1990s, Grid Computing has emerged as a paradigm for accessing and managing distributed, heterogeneous and geographically dispersed resources, promising that we will be able to access computing power as easily as we can access the electric power grid.



Distributed Data Management with OGSA–DAI

Michael J. Jackson, Mario Antonioletti, Bartosz Dobrzelecki, and Neil Chue Hong

Abstract OGSA–DAI provides a framework for sharing and managing distributed data. OGSA–DAI is highly customizable and can be used to manage, share and process distributed data (e.g. relational, XML, files and RDF triples). It does this by executing workflows that can encapsulate complex distributed data management scenarios in which data from one or more sources can be accessed, updated, combined and transformed. Moreover, the data processing capabilities provided by OGSA–DAI are further augmented by a powerful distributed query processor and relational views component that allow distributed data sources to be viewed and queried as if they were a single resource. OGSA–DAI allows researchers and business users to move away from logistical and technical concerns such as data locations, data models, data transfers and optimization strategies for data integration and instead focus on application-specific data analysis and processing.

4.1 Introduction

The Open Grid Services Architecture–Data Access and Integration Services (OGSA–DAI) framework has, since its inception in 2002, been designed to serve as a solution for complex distributed data management challenges in academia, industry and commerce. OGSA–DAI provides an environment for the execution of complex distributed data management scenarios in which data from multiple sources and of multiple types (e.g. relational, XML, files, RDF triple stores, web services) can be accessed, updated, combined, filtered, transformed and delivered.
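One way to picture such a scenario is as a chain of small steps through which records flow one at a time. The following is a minimal sketch in Python, assuming two in-memory lists stand in for a relational source and a file source. It does not use the OGSA–DAI API; every function name here (`read_source`, `combine`, `keep_if`, `transform`, `deliver`, `run_example`) is hypothetical and chosen only for illustration.

```python
from typing import Any, Callable, Dict, Iterable, Iterator, List

Record = Dict[str, Any]


def read_source(rows: Iterable[Record], origin: str) -> Iterator[Record]:
    """Access step: yield rows one at a time, tagged with their origin."""
    for row in rows:
        yield {**row, "origin": origin}


def combine(*streams: Iterator[Record]) -> Iterator[Record]:
    """Combine step: concatenate several record streams into one."""
    for stream in streams:
        yield from stream


def keep_if(stream: Iterator[Record],
            predicate: Callable[[Record], bool]) -> Iterator[Record]:
    """Filter step: pass on only the records that satisfy the predicate."""
    for record in stream:
        if predicate(record):
            yield record


def transform(stream: Iterator[Record],
              fn: Callable[[Record], Record]) -> Iterator[Record]:
    """Transform step: apply fn to each record as it flows past."""
    for record in stream:
        yield fn(record)


def deliver(stream: Iterator[Record]) -> List[Record]:
    """Deliver step: drain the stream into a list for the caller."""
    return list(stream)


def run_example() -> List[Record]:
    # Hypothetical stand-ins for a relational table and a data file.
    relational_rows = [{"id": 1, "temp_c": 20.0}, {"id": 2, "temp_c": -3.0}]
    file_rows = [{"id": 3, "temp_c": 30.0}]

    merged = combine(read_source(relational_rows, "db"),
                     read_source(file_rows, "file"))
    above_zero = keep_if(merged, lambda r: r["temp_c"] > 0)
    converted = transform(
        above_zero, lambda r: {**r, "temp_f": r["temp_c"] * 9 / 5 + 32})
    return deliver(converted)


if __name__ == "__main__":
    for record in run_example():
        print(record)
```

Because each step is a generator, no step holds the whole data set in memory. Records are pulled through the chain only when the final step asks for them.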

M.J. Jackson, M. Antonioletti, B. Dobrzelecki, N.C. Hong
EPCC, The University of Edinburgh, James Clerk Maxwell Building, The King’s Buildings, Mayfield Road, Edinburgh EH9 3JZ, UK

S. Fiore and G. Aloisio (eds.), Grid and Cloud Database Management, DOI 10.1007/978-3-642-20045-8_4, © Springer-Verlag Berlin Heidelberg 2011


Instead of being tailored as a solution to a specific distributed data management problem, OGSA–DAI has been designed to be extensible. It allows customizations to be made for individual application-specific requirements, whether in terms of the data resources supported, the data processing operations executed or the way in which the framework is accessed or exposed. Data streaming is fundamental to OGSA–DAI, enabling the processing of large data sets and the implicit exploitation of any parallelism available on the machines on which it runs. OGSA–DAI includes a distributed query processor for relational data sources [1], which has its origin in the OGSA–DQP distributed query processor [2, 3] developed by the Universities of Manchester and Newcastle. This distributed query processor allows complex queries involving distributed data sources to be expressed declaratively. Features such as these have facilitated OGSA–DAI’s adoption in the so