Publishing Without Publishers: A Decentralized Approach to Dissemination, Retrieval, and Archiving of Data

1 Department of Humanities, Social and Political Sciences, ETH Zurich, Zürich, Switzerland
2 Department of Computer Science, VU University Amsterdam, Amsterdam, The Netherlands
3 Swiss Institute of Bioinformatics, Geneva, Switzerland
4 Yale University School of Medicine, New Haven, CT, USA
5 Stanford Center for Biomedical Informatics Research, Stanford University, Stanford, CA, USA

Abstract. Making available and archiving scientific results is for the most part still considered the task of classical publishing companies, despite the fact that classical forms of publishing centered around printed narrative articles no longer seem well-suited in the digital age. In particular, there exist currently no efficient, reliable, and agreed-upon methods for publishing scientific datasets, which have become increasingly important for science. Here we propose to design scientific data publishing as a Web-based bottom-up process, without top-down control of central authorities such as publishing companies. Based on a novel combination of existing concepts and technologies, we present a server network to decentrally store and archive data in the form of nanopublications, an RDF-based format to represent scientific data. We show how this approach allows researchers to publish, retrieve, verify, and recombine datasets of nanopublications in a reliable and trustworthy manner, and we argue that this architecture could be used for the Semantic Web in general. Evaluation of the current small network shows that this system is efficient and reliable.

1 Introduction

Modern science increasingly depends on datasets, which however are left out in the classical way of publishing, i.e. through narrative (printed or online) articles in journals or conference proceedings. This means that the publications that describe scientific findings get disconnected from the data they are based on, which can seriously impair the verifiability and reproducibility of their results. Addressing this issue raises a number of practical problems: How should one publish scientific datasets and how can one refer to them in the respective scientific publications? How can we be sure that the data will remain available in the future and how can we be sure that data we find on the Web have not been corrupted or tampered with? Moreover, how can we refer to specific entries or subsets from large datasets?

To address some of these problems, a number of scientific data repositories have appeared, such as Figshare and Dryad. Furthermore, Digital Object Identifiers (DOI) have been advocated to be used not only for articles but also for scientific data [22]. While these services certainly improve the situation of scientific data, in particular when combined with

© Springer International Publishing Switzerland 2015. M. Arenas et al. (Eds.): ISWC 2015, Part I, LNCS 9366, pp. 656–672, 2015. DOI: 10.1007/978-3-319-25007-6_38
