Classification of periodic arrivals in event time data for filtering computer network traffic


Francesco Sanna Passino1 · Nicholas A. Heard1

Received: 22 May 2019 / Accepted: 8 April 2020 / © The Author(s) 2020

Abstract

Periodic patterns can often be observed in real-world event time data, possibly mixed with non-periodic arrival times. For modelling purposes, it is necessary to correctly distinguish the two types of events. This task has particularly important implications in computer network security; there, separating automated polling traffic and human-generated activity in a computer network is important for building realistic statistical models for normal activity, which in turn can be used for anomaly detection. Since automated events commonly occur at a fixed periodicity, statistical tests using Fourier analysis can efficiently detect whether the arrival times present an automated component. In this article, sequences of arrival times which contain automated events are further examined, to separate polling and non-periodic activity. This is first achieved using a simple mixture model on the unit circle based on the angular positions of each event time on the p-clock, where p represents the main periodicity associated with the automated activity; this model is then extended by combining a second source of information, the time of day of each event. Efficient implementations exploiting conjugate Bayesian models are discussed, and performance is assessed on real network flow data collected at Imperial College London.

Keywords Circular statistics · Network flow data · Mixture modelling · Periodic arrival times · Periodicity detection · Statistical cyber-security · Wrapped normal

1 Introduction

Event time data exhibit periodic behaviour in many real-life applications, for example astrophysics (Cicuttin et al. 1998), bioinformatics (Kocak et al. 2013), object tracking (Li et al. 2010) and computer networks (Heard et al. 2014; Price-Williams et al. 2017). The periodic arrival times can often be mixed with non-periodic events. Therefore, to model the generating process appropriately, it is required to correctly distinguish the event types. This article proposes a statistical method for classification of periodic arrivals within a sequence of event times.
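The angular position of an event time on the p-clock, mentioned in the abstract, can be illustrated with a minimal sketch. It assumes the mapping takes an event time t to the angle 2π(t mod p)/p, and it uses the mean resultant length from circular statistics as a simple diagnostic of concentration; this diagnostic is a stand-in chosen for illustration and is not the mixture model developed in the article. The function names, the period of 60 seconds and the simulated event times are all hypothetical.

```python
import math
import random

def p_clock_angles(event_times, p):
    """Map each event time to an angle in [0, 2*pi) on a clock of period p."""
    return [2.0 * math.pi * ((t % p) / p) for t in event_times]

def mean_resultant_length(angles):
    """Length of the average unit vector: near 1 if the angles cluster,
    near 0 if they are spread evenly around the circle."""
    n = len(angles)
    c = sum(math.cos(a) for a in angles) / n
    s = sum(math.sin(a) for a in angles) / n
    return math.hypot(c, s)

random.seed(0)
p = 60.0  # hypothetical polling period, in seconds

# Hypothetical automated events: one every p seconds, with small Gaussian jitter.
periodic = [10.0 + k * p + random.gauss(0.0, 1.0) for k in range(200)]
# Hypothetical human-generated events: uniform over the same time window.
non_periodic = [random.uniform(0.0, 200 * p) for _ in range(200)]

r_periodic = mean_resultant_length(p_clock_angles(periodic, p))
r_non_periodic = mean_resultant_length(p_clock_angles(non_periodic, p))
print(f"periodic: {r_periodic:.3f}, non-periodic: {r_non_periodic:.3f}")
```

Under these assumptions the periodic events concentrate around a single angle on the p-clock while the non-periodic events spread roughly uniformly, which is the contrast a mixture model on the unit circle can exploit.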

The authors gratefully acknowledge funding from the EPSRC and the Heilbronn Institute for Mathematical Research.

Francesco Sanna Passino [email protected] Nicholas A. Heard [email protected]

1 Department of Mathematics, Imperial College London, 180 Queen’s Gate, London SW7 2AZ, UK

This work is motivated by important applications in computer network security. In particular, network flow (NetFlow) data are analysed. NetFlow data provide information about Internet Protocol (IP) connections between nodes in a computer network and have been successfully used to monitor network traffic (Hofstede et al. 2014). These data are routinely collected in bulk at internet routers, providing large databases of IP address connectio