Traffic Dynamics-Aware Probe Selection for Fault Detection in Networks


Anjua Tayal1 · Neha Sharma1 · Neminath Hubballi1 · Maitreya Natu2

Received: 5 July 2019 / Revised: 21 January 2020 / Accepted: 22 January 2020
© Springer Science+Business Media, LLC, part of Springer Nature 2020

Abstract

Fault detection in modern networks is done with a set of specially instrumented nodes that send probes to find faults. These probes generate additional traffic in the network and compete with regular traffic for bandwidth. In this paper we consider the problem of dynamically adapting the probes based on the traffic dynamics experienced by nodes. We propose to profile the links and nodes to obtain aggregate I/O statistics in a time window and use them as an instantaneous measure of congestion. We use the I/O statistics to model the network as a weighted graph and formulate an optimization problem to find a minimum-weight set of probes covering the whole network. Having shown that finding minimum-weight probes maps to a known NP-complete problem, we propose three greedy algorithms for selecting probes. Using both simulated and real graphs of Internet Service Provider (ISP) networks, we perform five sets of experiments and show that the proposed algorithms adapt dynamically to changes in traffic dynamics and can select probes in large networks in reasonable time.

Keywords  Active probing · Candidate probes · Network congestion
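To make the minimum-weight covering formulation concrete, the following is a minimal sketch of the textbook greedy heuristic for weighted set cover. It is not one of the paper's three algorithms, which are not specified in this section. The function name `greedy_min_weight_probes`, the representation of a probe as the set of nodes it traverses, and the rule that a probe's weight is the sum of the I/O loads of those nodes are all assumptions made for illustration.

```python
from typing import Dict, FrozenSet, List, Set


def greedy_min_weight_probes(
    universe: Set[str],
    probes: Dict[str, FrozenSet[str]],
    weight: Dict[str, float],
) -> List[str]:
    """Pick probes until every element of `universe` is covered.

    At each step, choose the probe with the lowest weight per newly
    covered element (the standard greedy rule for weighted set cover).
    """
    uncovered = set(universe)
    chosen: List[str] = []
    while uncovered:
        best_name = None
        best_ratio = float("inf")
        for name, elems in probes.items():
            gain = len(uncovered & elems)
            if gain == 0:
                continue  # this probe covers nothing new
            ratio = weight[name] / gain
            if ratio < best_ratio:
                best_name, best_ratio = name, ratio
        if best_name is None:
            raise ValueError("the candidate probes cannot cover the network")
        chosen.append(best_name)
        uncovered -= probes[best_name]
    return chosen


# Hypothetical example: per-node I/O load in the current time window.
node_load = {"A": 1.0, "B": 5.0, "C": 1.0, "D": 2.0}

# Hypothetical candidate probes, each given as the set of nodes it traverses.
candidate_probes = {
    "p1": frozenset({"A", "B"}),
    "p2": frozenset({"A", "C"}),
    "p3": frozenset({"C", "D"}),
    "p4": frozenset({"B", "D"}),
}

# Assumed weighting: a probe costs the total load of the nodes on its path.
probe_weight = {
    name: sum(node_load[n] for n in nodes)
    for name, nodes in candidate_probes.items()
}

selected = greedy_min_weight_probes(set(node_load), candidate_probes, probe_weight)
print(selected)
```

Re-running the selection with fresh `node_load` values for each time window is one simple way to let the chosen probes follow the traffic. The greedy rule gives no optimality guarantee, which is consistent with the underlying problem being NP-complete.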

* Neminath Hubballi [email protected]
Anjua Tayal [email protected]
Neha Sharma [email protected]
Maitreya Natu [email protected]

1 Discipline of Computer Science and Engineering, Indian Institute of Technology Indore, Indore, India
2 Tata Research Development and Design Centre Pune, Pune, India






Journal of Network and Systems Management

1 Introduction

Effective network management involves addressing two important issues: failure detection and meeting the Quality of Service (QoS) demands of applications and users. Detecting failures quickly, and hence recovering from them quickly, helps guarantee uptime. The task of detecting and recovering from failures is becoming challenging as modern networks grow in size and complexity. Recent works [1] have proposed automating failure detection. To detect and diagnose failures, regular health information is collected from network components. There are two types of fault detection techniques: Monitoring and Probing. In Monitoring, agents are deployed at all nodes in the network to collect regular health information. These monitoring agents generate detailed information about the components (nodes and links) being monitored, often in volumes that are overwhelming to process. In Probing, specialized nodes known as Probe-stations send a set of test transactions to other network components, and failures are identified based on the responses received. To probe all the network components these Probe-stations are placed strategically in the network such that with minimum n