Towards the Domain Agnostic Generation of Natural Language Explanations from Provenance Graphs for Casual Users


Abstract. As more systems become PROV-enabled, there will be a corresponding increase in the need to communicate provenance data directly to users. Whilst there are a number of existing methods for doing this (formally, diagrammatically, and textually), there are currently no application-generic techniques for generating linguistic explanations of provenance. The principal reason for this is that a certain amount of linguistic information is required to transform a provenance graph, such as in PROV, into a textual explanation, and if this information is not available as an annotation, this transformation is presently not possible. In this paper, we describe how we have adapted the common ‘consensus’ architecture from the field of natural language generation to achieve this graph transformation, resulting in the novel PROVglish architecture. We then present an approach to garnering the necessary linguistic information from a PROV dataset, which involves exploiting the linguistic information informally encoded in the URIs denoting provenance resources. We finish by detailing an evaluation undertaken to assess the effectiveness of this approach to lexicalisation, demonstrating a significant improvement in terms of fluency, comprehensibility, and grammatical correctness.
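
To make the idea of linguistic information being informally encoded in URIs more concrete, the following is a minimal sketch of how candidate words might be recovered from a resource URI. It is an illustration only, not the PROVglish implementation: the function names `local_name` and `uri_to_words`, the example.org URIs, and the splitting heuristics (fragment or last path segment, separators, camelCase boundaries) are all assumptions made for this example.

```python
import re
from typing import List
from urllib.parse import urlparse, unquote


def local_name(uri: str) -> str:
    """Return the URI fragment if present, else the last non-empty path segment."""
    parsed = urlparse(uri)
    if parsed.fragment:
        return unquote(parsed.fragment)
    segments = [s for s in parsed.path.split("/") if s]
    return unquote(segments[-1]) if segments else ""


def uri_to_words(uri: str) -> List[str]:
    """Split a URI's local name into lower-case word tokens (hypothetical heuristic)."""
    name = local_name(uri)
    # Treat common separators as word boundaries.
    name = re.sub(r"[_\-.+]+", " ", name)
    # Insert a boundary at lower-to-upper camelCase transitions.
    name = re.sub(r"(?<=[a-z0-9])(?=[A-Z])", " ", name)
    return [word.lower() for word in name.split()]


if __name__ == "__main__":
    print(uri_to_words("http://example.org/data#bakeCake"))
    print(uri_to_words("http://example.org/resource/chocolate_cake"))
```

Tokens recovered this way would still need to be assigned a part of speech and inflected before they could be used in a sentence; that step is outside the scope of this sketch.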

1 Introduction

As organisations begin to understand the value of storing and utilising PROV data [13], they will increasingly find scenarios where it is useful to show that data to their users. Where resources allow, the best interfaces to this data will likely be bespoke creations, tailored to the specific needs of the application. However, we speculate that in many cases the resources will not be made available to take this approach, motivating the search for an application-generic way of communicating provenance to casual users. In this vein, there are already a number of different ways of communicating PROV data to human users in formal [14], diagrammatic [5,17], and linguistic forms [16]. The utility of these various approaches depends on a number of factors but, perhaps most importantly, on the user and their familiarity with the intricacies of both PROV and the application context. For example, whilst it is a very useful tool in a suitable context, it would not be appropriate to use the PROV-N notation to communicate with the vast majority of users. Likewise, the diagrammatic forms of representing PROV are also potentially inaccessible to many users who would perhaps have difficulty understanding mathematical graphs. A competent speaker of a particular language, on the other hand, is presumably far more likely to understand a well-worded provenance explanation than a diagrammatic representation in a format that they have not previously encountered. Linguistic interfaces are of further use in contexts where a visual interface might be inappropriate…

D.P. Richardson and L. Moreau
© Springer International Publishing Switzerland 2016. M. Mattoso and B. Glavic (Eds.): IPAW 2016, LNCS 9672, pp. 95–106, 2016. DOI: 10.1007/978-3-319-40593-3_8
