A study of learning likely data structure properties using machine learning models

PDF / 917,586 Bytes
15 Pages / 595.276 x 790.866 pts Page_size
65 Downloads / 237 Views

STTT Special Issue: SPIN 2019

A study of learning likely data structure properties using machine learning models Muhammad Usman1 · Wenxi Wang1 · Kaiyuan Wang1 · Cagdas Yelen1 · Nima Dini1 · Sarfraz Khurshid1

© Springer-Verlag GmbH Germany, part of Springer Nature 2020

Abstract Data structure properties are important for many testing and analysis tasks. For example, model checkers use these properties to find program faults. These properties are often written manually which can be error prone and lead to false alarms. This paper presents the results of controlled experiments performed using existing machine learning (ML) models on various data structures. These data structures are dynamic and reside on the program heap. We use ten data structure subjects and ten ML models to evaluate the learnability of data structure properties. The study reveals five key findings. One, most of the ML models perform well in learning data structure properties, but some of the ML models such as quadratic discriminant analysis and Gaussian naive Bayes are not suitable for learning data structure properties. Two, most of the ML models have high performance even when trained on just 1% of data samples. Three, certain data structure properties such as binary heap and red black tree are more learnable than others. Four, there are no significant differences between the learnability of varied-size (i.e., up to a certain size) and fixed-size data structures. Five, there can be significant differences in performance based on the encoding used. These findings show that using machine learning models to learn data structure properties is very promising. We believe that these properties, once learned, can be used to provide a run-time check to see whether a program state at a particular point satisfies the learned property. Learned properties can also be employed in the future to automate static and dynamic analysis, which would enhance software testing and verification techniques. Keywords Data structure invariants · Machine learning · Korat · Learnability

1 Introduction Data structure invariants play an extensive role in the verification and validation of software systems. These are properties that data structures should satisfy, and are also termed as

B

Muhammad Usman muhammadusman@utexas.edu Wenxi Wang wenxiw@utexas.edu Kaiyuan Wang kaiyuanw@utexas.edu Cagdas Yelen cagdas@utexas.edu Nima Dini nima.dini@utexas.edu Sarfraz Khurshid khurshid@utexas.edu

1

University of Texas at Austin, Austin, TX 78712, USA

class invariants in object-oriented programming [39,45]. For example, a binary tree should satisfy properties such that there is no cycle in the tree and that all nodes are reachable from the root node. We believe that data structure invariants play a vital role in the field of software testing. Model checkers usually use these invariants as assertions and try to violate these invariant properties in order to find bugs in programs. If an invariant is violated, it exposes a bug in the software system [25,44,67]. Data structure repair t

Data Loading...

A study of learning likely data structure properties using machine learning models

Recommend Documents

Machine Learning and Deep Learning Models for Big Data Issues

Evaluating Machine Learning Models

Machine Learning Models

Deploying Machine Learning Models

Investing Data with Machine Learning Using Python

Bilinear Models for Machine Learning

Machine Learning Models and Algorithms for Big Data Classification T

Explaining Machine Learning Models of Emotion Using the BIRAFFE Dataset

Big Data and Machine Learning

Breast Cancer Prediction: A Comparative Study Using Machine Learning Techniques

Ensuring Data Privacy Using Machine Learning for Responsible Data Science

Characterization of Iranian Grapevine Cultivars Using Machine Learning Models