Advances in Property-Based Testing for αProlog
1 University of Edinburgh, Edinburgh, UK
[email protected]
2 Università degli Studi di Milano, Milan, Italy
[email protected], [email protected]
Abstract. αCheck is a lightweight property-based testing tool built on top of αProlog, a logic programming language based on nominal logic. αProlog is particularly suited to the validation of the meta-theory of formal systems, for example the correctness of compiler translations involving name-binding, alpha-equivalence and capture-avoiding substitution. In this paper we describe an alternative to the negation elimination algorithm underlying αCheck that substantially improves its effectiveness. To substantiate this claim, we compare the checker's performance with two of its main competitors in the logical framework niche, namely the QuickCheck/Nitpick combination offered by Isabelle/HOL and the random testing facility in PLT-Redex.
1 Introduction
Formal compiler verification has come a long way from McCarthy and Painter’s “Correctness of a Compiler for Arithmetic Expressions” (1967), as witnessed by the success of CompCert and subsequent projects [21,35]. However outstanding these achievements are, they are not a magic wand for everyday compiler writers: not only was CompCert designed with verification in mind, whereby the implementation and the verification were a single process, but there are only a few dozen people in the world able and willing to carry out such an endeavour. By verification, CompCert means the preservation of certain simulation relations between source, intermediate and target code; however, the translations involved are relatively simple compared to those employed by modern optimizing compilers. Despite some initial work [1,7], handling more realistic optimizations seems even harder, e.g. the verification of the call arity analysis and transformation in the Glasgow Haskell Compiler (GHC): “The [Nominal] Isabelle development corresponding to this paper, including the definition of the syntax and the semantics, contains roughly 12,000 lines of code with 1,200 lemmas (many small, some large) in 75 theories, created over the course of 9 months” (page 11, [7]).
For the rest of us, then, it is back to compiler testing, which is basically synonymous with passing a hand-written, fixed validation suite. This is not completely satisfactory, as the coverage of those tests is difficult to assess and because, being fixed, these suites will not uncover new bugs. In the last few years, randomized differential testing [24] has been suggested in combination with automatic generation of (expressive) test programs, most notably for C compilers with the Csmith tool [36] and, to a lesser extent, for GHC [30]. The oracle is comparison checking: Csmith feeds randomly generated programs to several compilers and flags the minority one(s), that is, those reporting different outputs f
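To make the comparison-checking oracle concrete, the following Python sketch outlines a Csmith-style differential testing loop. It is only an illustration of the idea: the compiler list and the generate_program hook are placeholders assumed for this example, not part of Csmith's actual interface.

    import os
    import subprocess
    from collections import Counter

    # Placeholder compiler commands; in Csmith's setting these would be
    # several real C compilers (or several optimization levels of one).
    COMPILERS = ["gcc", "clang", "tcc"]

    def differential_test(generate_program, runs=100):
        """Run `runs` rounds of differential testing.

        `generate_program` is assumed to return the path of a freshly
        generated, well-defined test program, as Csmith does for C.
        """
        for _ in range(runs):
            src = generate_program()
            outputs = {}
            for cc in COMPILERS:
                exe = os.path.abspath(f"{src}.{cc}.out")
                # Compile the same program with each compiler, then run it.
                subprocess.run([cc, src, "-o", exe], check=True)
                result = subprocess.run([exe], capture_output=True, timeout=10)
                outputs[cc] = result.stdout
            # Comparison checking: the majority output serves as the oracle;
            # compilers in the minority are flagged as potentially buggy.
            majority, _ = Counter(outputs.values()).most_common(1)[0]
            suspects = [cc for cc, out in outputs.items() if out != majority]
            if suspects:
                print(f"{src}: compilers disagree, suspects: {suspects}")

The design choice worth noting is that no reference semantics is needed: the compilers under test act as each other's oracle, which is what makes the approach applicable when a formal specification is unavailable.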