Interface protocol inference to aid understanding legacy software components



THEME SECTION PAPER

Kousar Aslam¹ · Loek Cleophas¹ · Ramon Schiffelers¹,² · Mark van den Brand¹

Received: 22 June 2019 / Revised: 28 May 2020 / Accepted: 31 May 2020
© The Author(s) 2020

Abstract
High-tech companies today struggle with the maintenance of legacy software. Legacy software is vital to many organizations because it contains important business logic. To facilitate maintenance of legacy software, a comprehensive understanding of the software's behavior is essential. In terms of component-based software engineering, it is necessary to completely understand the behavior of components in relation to their interfaces, i.e., their interface protocols, and to preserve this behavior during the maintenance activities of the components. For this purpose, we present an approach to infer the interface protocols of software components from the behavioral models of those components, learned by a black-box technique called active (automata) learning. To validate the learned results, we applied our approach to software components developed with model-based engineering, so that equivalence can be checked between the learned models and the reference models, ensuring the behavioral relations are preserved. Experimenting with components that have reference models and performing equivalence checking builds confidence that applying the active learning technique to reverse engineer legacy software components, for which no reference models are available, will also yield correct results. To apply our approach in practice, we present an automated framework for conducting active learning on a large set of components and deriving their interface protocols. Using the framework, we validated our methodology by applying active learning to 202 industrial software components, out of which interface protocols could be successfully derived for 156 components within our given time bound of 1 h for each component.

Keywords Active automata learning · Interface protocols · Learning framework · Equivalence oracles
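To make the idea of checking a learned model against a reference model concrete, the following is a minimal sketch in Python. It is not the tooling used in this work. It assumes both models are complete, deterministic Mealy machines encoded as dictionaries, and it searches the product of the two machines breadth-first for a shortest input sequence on which their outputs differ. The function name `find_counterexample`, the dictionary encoding, and the toy `open`/`read`/`close` component are all hypothetical and chosen for illustration only.

```python
from collections import deque


def find_counterexample(learned, reference, inputs, start_learned, start_reference):
    """Return a shortest input word on which the two machines disagree, or None.

    Each machine is a dict mapping (state, input) -> (next_state, output).
    Both machines are assumed to be complete and deterministic.
    """
    queue = deque([(start_learned, start_reference, ())])
    seen = {(start_learned, start_reference)}
    while queue:
        s_l, s_r, word = queue.popleft()
        for a in inputs:
            next_l, out_l = learned[(s_l, a)]
            next_r, out_r = reference[(s_r, a)]
            if out_l != out_r:
                # The outputs differ, so word + a distinguishes the models.
                return list(word) + [a]
            if (next_l, next_r) not in seen:
                seen.add((next_l, next_r))
                queue.append((next_l, next_r, word + (a,)))
    return None  # no reachable state pair disagrees: the models are equivalent


# Hypothetical reference model of a component with a simple open/read/close protocol.
INPUTS = ["open", "read", "close"]
REFERENCE = {
    ("idle", "open"): ("opened", "ok"),
    ("idle", "read"): ("idle", "error"),
    ("idle", "close"): ("idle", "error"),
    ("opened", "open"): ("opened", "error"),
    ("opened", "read"): ("opened", "data"),
    ("opened", "close"): ("idle", "ok"),
}

# Hypothetical, too coarse hypothesis: a single state that never tracks "opened".
LEARNED = {
    ("q0", "open"): ("q0", "ok"),
    ("q0", "read"): ("q0", "error"),
    ("q0", "close"): ("q0", "error"),
}

counterexample = find_counterexample(LEARNED, REFERENCE, INPUTS, "q0", "idle")
print(counterexample)
```

In an active learning loop, a distinguishing sequence of this kind would typically be handed back to the learner to refine its hypothesis. When no reference model exists, as with legacy components, such an exact comparison is not available and an equivalence oracle has to be approximated, for example by testing.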

Communicated by Federico Ciccozzi, Antonio Cicchetti and Andreas Wortmann.

¹ Eindhoven University of Technology, Eindhoven, The Netherlands
² ASML, Veldhoven, The Netherlands

1 Introduction
Large-scale software systems are inherently complex, with complexity caused by a large number of constituent components and the interactions between them [54]. The software also changes over time due to maintenance as a result of evolving requirements, emerging technology trends and hardware changes [38]. To deal with this ever-increasing complexity of high-tech software systems, different software development methodologies emerged with the goal of raising the abstraction level of software development. In the late 1960s, component-based software engineering (CBSE) [9] started becomin