An epistemic approach to the formal specification of statistical machine learning
- PDF / 816,483 Bytes
- 18 Pages / 595.276 x 790.866 pts Page_size
- 47 Downloads / 227 Views
SPECIAL SECTION PAPER
An epistemic approach to the formal specification of statistical machine learning Yusuke Kawamoto1 Received: 2 March 2020 / Revised: 7 June 2020 / Accepted: 14 August 2020 © The Author(s) 2020
Abstract We propose an epistemic approach to formalizing statistical properties of machine learning. Specifically, we introduce a formal model for supervised learning based on a Kripke model where each possible world corresponds to a possible dataset and modal operators are interpreted as transformation and testing on datasets. Then, we formalize various notions of the classification performance, robustness, and fairness of statistical classifiers by using our extension of statistical epistemic logic. In this formalization, we show relationships among properties of classifiers, and relevance between classification performance and robustness. As far as we know, this is the first work that uses epistemic models and logical formulas to express statistical properties of machine learning, and would be a starting point to develop theories of formal specification of machine learning. Keywords Modal logic · Possible world semantics · Machine learning · Classification performance · Robustness · Fairness
1 Introduction With the increasing use of machine learning in real-life applications, the safety and security of learning-based systems have been of great interest. In particular, many recent studies [12,40] have found vulnerabilities on the robustness of deep neural networks (DNNs) to malicious inputs, which can lead to disasters in security critical systems, such as self-driving cars. To find out these vulnerabilities in advance, there have been researches on the formal verification and testing methods for the robustness of DNNs in recent years [23,26,35,41]. However, relatively little attention has been paid to the formal specification of machine learning [38]. In the research filed of formal specification and verification, logical approaches have been shown useful to characterize desired properties and to develop theories to disCommunicated by Gwen Salaün and Peter Csaba Ölveczky. This work was supported by the New Energy and Industrial Technology Development Organization (NEDO), by ERATO HASUO Metamathematics for Systems Design Project (No. JPMJER1603), JST, and by Inria under the project LOGIS.
B 1
Yusuke Kawamoto [email protected] AIST, Tsukuba, Japan
cuss those properties. For example, temporal logic [36] is a branch of modal logic for expressing time-dependent propositions, and has been widely used to describe requirements of hardware and software systems. For another example, epistemic logic [44] is a modal logic for knowledge and belief that has been employed as formal policy languages for distributed systems (e.g., for the authentication [8] and the anonymity [39] of security protocols). As far as we know, however, no prior work has employed logical formulas to rigorously describe various statistical properties of machine learning, although there are some papers that (often informall
Data Loading...