Modelling Provenance Collection Points and Their Impact on Provenance Graphs


Marymount University, Arlington, VA, USA [email protected]
The MITRE Corporation, McLean, VA, USA {slscott,achapman}@mitre.org

Abstract. As many domains employ ever more complex systems-of-systems, capturing provenance among component systems is increasingly important. Applications such as intrusion detection, load balancing, traffic routing, and insider threat detection all involve monitoring and analyzing the data provenance. Implicit in these applications is the assumption that “good” provenance is captured (e.g., complete provenance graphs, or one full path). When attempting to provide “good” provenance for a complex system of systems, it is necessary to know “how hard” the provenance-enabling will be and the likely quality of the provenance to be produced. In this work, we provide analytical results and simulation tools to assist in the scoping of the provenance-enabling process. We provide use cases of complex systems-of-systems within which users wish to capture provenance. We describe the parameters that must be taken into account when undertaking the provenance-enabling of a system of systems. We provide a tool that models the interactions and types of capture agents involved in a complex system of systems, including the set of known and unknown systems in the environment. The tool estimates the quantity and type of capture agents that will need to be deployed for provenance-enablement in a complex system that is not completely known.

Keywords: Provenance · Lineage · Agent Based Modelling · Modelling and simulation · Complex systems

1 Introduction

Provenance, the record of creation, update and activities that influence a piece of data, is used to: understand if data was produced correctly (according to published methodology, or according to policy); detect suspicious behavior within complex systems; and, enable trust during cross-organizational collaboration [3]. The utility of the provenance stream for these purposes is tied to what information is actually collected, and how far through the system the provenance can “see”.

Approved for Public Release #16-0858. The authors’ affiliation with The MITRE Corporation is provided for identification purposes only, and is not intended to convey or imply MITRE’s concurrence with, or support for, the positions, opinions or viewpoints expressed by the author.

© Springer International Publishing Switzerland 2016
M. Mattoso and B. Glavic (Eds.): IPAW 2016, LNCS 9672, pp. 146–157, 2016.
DOI: 10.1007/978-3-319-40593-3_12


In our experience, when government organizations seeking to become provenance aware approach us, the first question is: How much of the system must be provenance aware in order to use the provenance data stream in the desired manner? One of the first considerations is how many capture agents are needed to have good coverage of the system of systems. The next question is, which system(s), if provenance-capture enabled, will give the most “bang for the buck”? In ot