Structural Statistical Software Testing with Active Learning in a Graph
Structural Statistical Software Testing (SSST) exploits the control flow graph of the program being tested to construct test cases. Specifically, SSST exploits the feasible paths in the control flow graph, that is, paths which are actually exerted for som
- PDF / 548,925 Bytes
- 14 Pages / 430 x 660 pts Page_size
- 89 Downloads / 221 Views
tract. Structural Statistical Software Testing (SSST) exploits the control flow graph of the program being tested to construct test cases. Specifically, SSST exploits the feasible paths in the control flow graph, that is, paths which are actually exerted for some values of the program input; the limitation is that feasible paths are massively outnumbered by infeasible ones. Addressing this limitation, this paper presents an active learning algorithm aimed at sampling the feasible paths in the control flow graph. The difficulty comes from both the few feasible paths initially available and the nature of the feasible path concept, reflecting the long-range dependencies among the nodes of the control flow graph. The proposed approach is based on a frugal representation inspired from Parikh maps, and on the identification of the conjunctive subconcepts in the feasible path concept within a Disjunctive Version Space framework. Experimental validation on real-world and artificial problems demonstrates significant improvements compared to the state of the art. Keywords: Structured Sampling, Structured Active Learning, Structural Statistical Software Testing, Disjunctive Version Space, Machine Learning Application to Computer Science.
1
Introduction
Autonomic Computing is becoming a new application domain for Machine Learning (ML), motivated by the increasing complexity of current systems [1]. Ideally, systems should be able to automatically adapt, maintain and repair themselves; a first step to this end is to build self-aware systems, using ML to automatically model the system behavior. Similar trends are observed in the field of software design; various ML approaches have been proposed for Software Testing [2,3], Software Modeling [4] and Software Debugging [5]. Resuming an earlier work [3], this paper is motivated by Statistical Structural Software Testing (SSST) [6]. SSST exploits the control flow graph of the program being tested (Fig. 1) to construct test cases; specifically, test cases are derived from the feasible paths in the control flow graph, that is, paths which are actually exerted for some values of the program input. However, for reasonable size programs there is a huge gap between the syntactical description of the program (the control flow graph) and its semantics (the feasible paths). In practice, H. Blockeel et al. (Eds.): ILP 2007, LNAI 4894, pp. 49–62, 2008. c Springer-Verlag Berlin Heidelberg 2008
50
N. Baskiotis and M. Sebag
the fraction of feasible paths might be as low as 10−5 for small size programs, making it inefficient to uniformly sample the paths in the control flow graph. The characterization of the feasible path region faces several difficulties. First of all, the target concept (i.e. the feasible path region) is non-Markovian: a path is infeasible as it violates some subtle, long-range dependencies among the program nodes. A frugal propositional representation extending Parikh maps [7] was proposed in [3], allowing one to express node dependencies in a compact way. However, using either a relational or
Data Loading...