Modelling the Effects of Combining Diverse Software Fault Detection Techniques




1 Centre for Software Reliability, City University, Northampton Square, London EC1V 0HB, UK
{b.littlewood,ptp,strigini}@csr.city.ac.uk
http://www.csr.city.ac.uk

2 School of Psychological Sciences, University of Manchester, Manchester M13 9PL, UK
[email protected]

Abstract. The software engineering literature contains many studies of the efficacy of fault finding techniques. Few of these, however, consider what happens when several different techniques are used together. We show that the effectiveness of such multi-technique approaches depends upon quite subtle interplay between their individual efficacies and dependence between them. The modelling tool we use to study this problem is closely related to earlier work on software design diversity. The earliest of these results showed that, under quite plausible assumptions, it would be unreasonable even to expect software versions that were developed ‘truly independently’ to fail independently of one another. The key idea here was a ‘difficulty function’ over the input space. Later work extended these ideas to introduce a notion of ‘forced’ diversity, in which it became possible to obtain system failure behaviour better even than could be expected if the versions failed independently. In this paper we show that many of these results for design diversity have counterparts in diverse fault detection in a single software version. We define measures of fault finding effectiveness, and of diversity, and show how these might be used to give guidance for the optimal application of different fault finding procedures to a particular program. We show that the effects upon reliability of repeated applications of a particular fault finding procedure are not statistically independent – in fact such an incorrect assumption of independence will always give results that are too optimistic.
For diverse fault finding procedures, on the other hand, things are different: here it is possible for effectiveness to be even greater than it would be under an assumption of statistical independence. We show that diversity of fault finding procedures is, in a precisely defined way, ‘a good thing’, and should be applied as widely as possible. The new model and its results are illustrated using some data from an experimental investigation into diverse fault finding on a railway signalling application.
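The two claims above, about repeated and about diverse fault finding, can be illustrated numerically with a minimal sketch. It is not the model developed in the paper, whose formulas lie outside this excerpt. It assumes only the ‘difficulty function’ idea mentioned in the abstract: each fault has its own probability of being missed by one application of a technique, and applications are independent once the fault is fixed. The four faults, their equal weights, and the values in miss_a and miss_b are hypothetical.

```python
def mean(values):
    """Arithmetic mean over equally likely faults."""
    values = list(values)
    return sum(values) / len(values)


def mean_product(xs, ys):
    """Mean of the per-fault products x * y."""
    return mean(x * y for x, y in zip(xs, ys))


# Hypothetical probabilities that ONE application of a technique misses
# each of four equally likely faults (a "difficulty" value per fault).
miss_a = [0.9, 0.1, 0.5, 0.1]  # technique A
miss_b = [0.1, 0.9, 0.1, 0.5]  # technique B, strong where A is weak

# Technique A applied twice: chance that a randomly chosen fault survives both.
repeat_actual = mean_product(miss_a, miss_a)          # mean of squares
repeat_if_independent = mean(miss_a) ** 2             # naive independence

# Techniques A and B applied once each.
diverse_actual = mean_product(miss_a, miss_b)
diverse_if_independent = mean(miss_a) * mean(miss_b)  # naive independence

print(f"repeat A twice: actual {repeat_actual:.2f}, "
      f"independence assumption {repeat_if_independent:.2f}")
print(f"A then B:       actual {diverse_actual:.2f}, "
      f"independence assumption {diverse_if_independent:.2f}")
```

With these made-up numbers, repeating A leaves a fault undetected with probability 0.27, whereas the independence assumption predicts 0.16, so that assumption is too optimistic. This direction always holds, because the mean of squares is never smaller than the square of the mean. For the diverse pair the actual value, 0.07, is lower than the 0.16 predicted under independence, because the two hypothetical difficulty profiles are negatively correlated.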

* Work performed while this author was at the Department of Psychology, University of Hull, Hull HU6 7RX, UK.

B. Littlewood et al.

R.M. Hierons et al. (Eds.): Formal Methods and Testing, LNCS 4949, pp. 345–366, 2008. © 2000 IEEE. Reprinted, with permission, from IEEE Trans. Software Engineering, vol. 26(12), pp. 1157–1167, 2000

1 Introduction

Diversity is ubiquitous in human activity. In quite mundane contexts it is common to use diversity to improve confidence: for example, I might ask a colleague to check my arithmetic in a complex calculation. The informal idea is that the mistakes he might make will differ from those that I might make, and our arriving at the same answer suggests