Dataset Summary Visualization with LODSight

Abstract. We present a web-based tool that shows a summary of an RDF dataset as a visualization of a graph formed from the classes, datatypes and predicates used in the dataset. The visualization should allow users to quickly and easily find out what kind of data the dataset contains and how it is structured. It also shows how vocabularies are used in the dataset.

1 Introduction

In contrast to the RDBMS world, RDF datasets on the semantic web are usually not provided with a schema. There are of course RDFS/OWL vocabularies,¹ but those only define sets of concepts that can be used in a dataset, not which combinations of concepts should be used to describe the data. The relationship between a vocabulary and a dataset is thus far less strict than the relationship between an SQL database and its schema.

When users encounter an RDF dataset they are not familiar with, finding out what kind of data it contains is nontrivial, since up-to-date and complete documentation of the dataset is usually not available. Conversely, when users encounter a vocabulary they are not familiar with, they can try to learn its proper usage by reading the documentation and inspecting the axioms, labels and comments in the RDFS/OWL source code. However, even the vocabulary documentation may be insufficient or missing, and the axioms do not fully specify the usage, as noted above. The only remaining option is then to look at datasets where the vocabulary is used (provided they exist).

Users can of course explore a dataset manually, e.g., using exploratory SPARQL queries. Several approaches to dataset summarization that make such exploration easier have been implemented (discussed in Sect. 2). We present LODSight,² a dataset summary visualization tool based on those existing approaches. LODSight aims to be applicable to any RDF dataset available through a SPARQL endpoint and to show the whole dataset summary in one view. It searches the dataset for typical combinations of class instances and properties. The combinations are merged into one graph and displayed as an interactive, JavaScript-based node-link visualization. The application is designed to allow a non-expert user to both (a) get an overview of the data in the dataset and its structure, and (b) learn how the vocabularies are used in it.
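To make the idea of "typical combinations of class instances and properties" concrete, the following is a minimal sketch in Python. It is not LODSight's implementation: it works on a small in-memory list of triples instead of a SPARQL endpoint, and the toy data, the prefixed names and the helper functions `summarize` and `to_graph` are all hypothetical. It only illustrates one plausible way to lift instance-level triples to class-level combinations and merge them into a single node-link structure.

```python
from collections import Counter

RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
XSD_STRING = "http://www.w3.org/2001/XMLSchema#string"

# Hypothetical toy dataset of (subject, predicate, object) triples.
# Literal objects are modelled as ("literal", datatype_iri) tuples.
TRIPLES = [
    ("ex:alice", RDF_TYPE, "foaf:Person"),
    ("ex:bob", RDF_TYPE, "foaf:Person"),
    ("ex:acme", RDF_TYPE, "org:Organization"),
    ("ex:alice", "foaf:name", ("literal", XSD_STRING)),
    ("ex:bob", "foaf:name", ("literal", XSD_STRING)),
    ("ex:alice", "org:memberOf", "ex:acme"),
    ("ex:bob", "foaf:knows", "ex:alice"),
]


def summarize(triples):
    """Count (subject class, predicate, object class or datatype) combinations."""
    # First pass: collect the declared classes of every resource.
    classes = {}
    for s, p, o in triples:
        if p == RDF_TYPE:
            classes.setdefault(s, set()).add(o)

    # Second pass: lift each instance-level triple to the class level.
    edges = Counter()
    for s, p, o in triples:
        if p == RDF_TYPE:
            continue
        if isinstance(o, tuple):
            targets = {o[1]}                  # literal: target node is its datatype
        else:
            targets = classes.get(o, set())   # resource: target nodes are its classes
        for source in classes.get(s, set()):
            for target in targets:
                edges[(source, p, target)] += 1
    return edges


def to_graph(edges):
    """Merge the counted combinations into one node-link structure."""
    nodes = sorted({n for (src, _, dst) in edges for n in (src, dst)})
    links = [
        {"source": src, "predicate": p, "target": dst, "count": c}
        for (src, p, dst), c in sorted(edges.items())
    ]
    return {"nodes": nodes, "links": links}


summary = summarize(TRIPLES)
graph = to_graph(summary)
for link in graph["links"]:
    print(link["source"], link["predicate"], link["target"], link["count"])
```

In this sketch the nodes are classes and datatypes and the links are predicates, which mirrors the kind of graph described above. The counts give a rough indication of how typical each combination is. Untyped resources are simply ignored here, which is a simplifying assumption of the sketch.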

¹ By vocabulary we mean ontology or vocabulary.
² Available at http://lod2-dev.vse.cz/lodsight.

© Springer International Publishing Switzerland 2015. F. Gandon et al. (Eds.): ESWC 2015, LNCS 9341, pp. 36–40, 2015. DOI: 10.1007/978-3-319-25639-9_7

2 Related Research

The visualization in LODSight is based on the same principle as maps of ontology usage [3]. However, maps of ontology usage restrict the visualization to entities from a single namespace, whereas we display all classes in one summary regardless of namespace. Maps of ontology usage rely on YARS2, while we support remote summarization of, in principle, any SPARQL endpoint. ExpLOD [4] offers a more complex approach based on bisimul