Extracting Formal Models from Normative Texts


CSE, Chalmers University of Technology and University of Gothenburg, Gothenburg, Sweden {john.j.camilleri,gerardo}@cse.gu.se
IMCS, University of Latvia, Riga, Latvia [email protected]

Abstract. Normative texts are documents based on the deontic notions of obligation, permission, and prohibition. Our goal is to model such texts using the C-O Diagram formalism, making them amenable to formal analysis, in particular verifying that a text satisfies properties concerning causality of actions and timing constraints. We present an experimental, semi-automatic aid to bridge the gap between a normative text and its formal representation. Our approach uses dependency trees combined with our own rules and heuristics for extracting the relevant components. The resulting tabular data can then be converted into a C-O Diagram.

Keywords: Information extraction · Normative texts · C-O diagrams

1 Introduction

Normative texts are concerned with what must be done, may be done, or should not be done (deontic norms). This class of documents includes contracts, terms of service and regulations. Our aim is to be able to query such documents by first modelling them in the deontic-based C-O Diagram [4] formal language. Models in this formalism can be automatically converted into networks of timed automata [1], which are amenable to verification. There is, however, a large gap between natural language texts as written by humans and the formal representation used for automated analysis. The task of modelling a text is completely manual, requiring good knowledge of both the domain and the formalism. In this paper we present a method which helps to bridge this gap by automatically extracting a partial model using NLP techniques.

Our technique processes normative texts written in natural language and builds partial models from them by analysing their syntactic structure and extracting relevant information. The method uses dependency structures obtained from a general-purpose statistical parser, namely the Stanford parser [3]. These structures are then processed using custom rules and heuristics, which we specified based on a small development corpus, to produce a table of predicate candidates. This can be seen as a specific information extraction task. While the method may only produce a partial model which requires further post-editing by the user, we aim to take over the most tedious work so that the user (knowledge engineer) can focus on formalisation details.

© Springer International Publishing Switzerland 2016. E. Métais et al. (Eds.): NLDB 2016, LNCS 9612, pp. 403–408, 2016. DOI: 10.1007/978-3-319-41754-7_40
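To make the extraction idea from the introduction concrete, the following is a minimal sketch of a single rule that turns the dependency triples of one sentence into a row of a predicate-candidate table. It is not the authors' rule set. The triples are written by hand rather than produced by a parser, the names `Dep`, `Candidate`, `MODALS` and `extract_candidate` are hypothetical, and the mapping from modal verbs to deontic notions (with a negated modal read as prohibition) is a simplifying assumption. The relation labels `nsubj`, `aux`, `neg` and `dobj` follow the Stanford typed dependencies scheme.

```python
from typing import List, NamedTuple, Optional


class Dep(NamedTuple):
    """One dependency triple: rel(head, dependent). Hypothetical helper type."""
    rel: str
    head: str
    dependent: str


class Candidate(NamedTuple):
    """One row of the predicate-candidate table. Hypothetical helper type."""
    subject: Optional[str]
    modality: str
    verb: str
    obj: Optional[str]


# Assumed mapping from modal verbs to deontic notions (illustrative only).
MODALS = {"must": "obligation", "shall": "obligation", "may": "permission"}


def find_dependent(deps: List[Dep], rel: str, head: str) -> Optional[str]:
    """Return the first dependent attached to `head` by relation `rel`."""
    for d in deps:
        if d.rel == rel and d.head == head:
            return d.dependent
    return None


def extract_candidate(deps: List[Dep]) -> Optional[Candidate]:
    """Apply one hand-written rule: a modal auxiliary marks a deontic clause."""
    for d in deps:
        if d.rel == "aux" and d.dependent.lower() in MODALS:
            verb = d.head
            modality = MODALS[d.dependent.lower()]
            # Assumption: any negated modal clause is read as a prohibition.
            if find_dependent(deps, "neg", verb) is not None:
                modality = "prohibition"
            return Candidate(
                subject=find_dependent(deps, "nsubj", verb),
                modality=modality,
                verb=verb,
                obj=find_dependent(deps, "dobj", verb),
            )
    return None  # no modal found, so this rule proposes nothing


# "The customer must pay the fee."
sentence_1 = [
    Dep("det", "customer", "The"),
    Dep("nsubj", "pay", "customer"),
    Dep("aux", "pay", "must"),
    Dep("dobj", "pay", "fee"),
    Dep("det", "fee", "the"),
]

# "The user may not share passwords."
sentence_2 = [
    Dep("det", "user", "The"),
    Dep("nsubj", "share", "user"),
    Dep("aux", "share", "may"),
    Dep("neg", "share", "not"),
    Dep("dobj", "share", "passwords"),
]

if __name__ == "__main__":
    for deps in (sentence_1, sentence_2):
        print(extract_candidate(deps))
```

A rule of this kind only proposes candidates. Sentences without a modal auxiliary yield nothing, which is consistent with the expectation that the extracted model is partial and needs post-editing by the user.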

J.J. Camilleri et al.

2 Extracting Predicate Candidates

The proposed approach is application-specific but domain-independent, assuming that normative texts tend to follow a certain specialised style of natural language, even though there are variations across and within domains. We do not impose any grammatical or lexical restrictions on the input texts; therefore, we first apply the