Learning credible DNNs via incorporating prior knowledge and model local explanation
Mengnan Du · Ninghao Liu · Fan Yang · Xia Hu
Received: 8 January 2020 / Revised: 25 September 2020 / Accepted: 4 October 2020
© Springer-Verlag London Ltd., part of Springer Nature 2020
Abstract
Recent studies have shown that state-of-the-art DNNs are not always credible, despite their impressive performance on the hold-out test sets of a variety of tasks. These models tend to exploit dataset shortcuts to make predictions rather than learn the underlying task. This non-credibility can lead to low generalization, adversarial vulnerability, and algorithmic discrimination in DNN models. In this paper, we propose CREX to develop more credible DNNs. The high-level idea of CREX is to encourage DNN models to focus on the evidence that actually matters for the task at hand and to avoid overfitting to data-dependent shortcuts. Specifically, during DNN training, CREX directly regularizes the local explanation with expert rationales, i.e., subsets of features highlighted by domain experts as justifications for predictions, to enforce alignment between local explanations and rationales. Even when rationales are not available, CREX remains useful by requiring the generated explanations to be sparse. In addition, CREX is widely applicable to different network architectures, including CNNs, LSTMs, and attention models. Experimental results on several text classification datasets demonstrate that CREX increases the credibility of DNNs. Comprehensive analysis further shows three meaningful improvements from CREX: (1) it significantly increases DNN accuracy on new and previously unseen data beyond the test set, (2) it enhances the fairness of DNNs under the equality-of-opportunity metric and reduces models' discrimination toward certain demographic groups, and (3) it improves the robustness of DNN models against adversarial attacks. These results highlight the advantages of the increased credibility afforded by CREX.

Keywords Deep neural network · Credibility · Prior knowledge · Generalization · Fairness · Adversarial
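To make the idea concrete, the sketch below illustrates one way such a training objective can be set up: a task loss plus a regularizer that aligns a differentiable local explanation with expert rationales, falling back to a sparsity penalty when no rationales exist. This is an illustrative approximation only, not the authors' implementation; it uses input gradients as a simple stand-in for the local explanation, and the names crex_style_loss, rationale_mask, lambda_align, and lambda_sparse are hypothetical.

import torch
import torch.nn.functional as F

def crex_style_loss(model, x, y, rationale_mask=None,
                    lambda_align=1.0, lambda_sparse=0.01):
    # x: (batch, num_features) continuous inputs (e.g., embedded text);
    # y: (batch,) class labels.
    # rationale_mask: same shape as x, 1 on expert-highlighted features,
    # 0 elsewhere; None when no rationales are available.
    x = x.clone().detach().requires_grad_(True)
    logits = model(x)
    task_loss = F.cross_entropy(logits, y)

    # Input-gradient magnitude as a differentiable stand-in for the
    # model's local explanation of its prediction.
    grads = torch.autograd.grad(task_loss, x, create_graph=True)[0]
    explanation = grads.abs()

    if rationale_mask is not None:
        # Penalize explanation weight that falls outside the rationale,
        # pushing the model to rely on expert-endorsed evidence.
        align_penalty = (explanation * (1.0 - rationale_mask)).sum(dim=1).mean()
        return task_loss + lambda_align * align_penalty

    # No rationales available: an L1 sparsity penalty on the explanation
    # discourages diffuse, shortcut-driven attributions.
    return task_loss + lambda_sparse * explanation.sum(dim=1).mean()

In this sketch, lambda_align trades off task accuracy against agreement with expert rationales, and the sparsity fallback mirrors the rationale-free variant described above; the actual CREX formulation plugs in explanation mechanisms suited to CNN, LSTM, and attention architectures.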
Mengnan Du (corresponding author) [email protected]
Ninghao Liu [email protected]
Fan Yang [email protected]
Xia Hu [email protected]

Department of Computer Science and Engineering, Texas A&M University, College Station, USA
1 Introduction

Deep neural networks (DNNs) have achieved super-human performance in many applications, including complex vision tasks such as object recognition and semantic segmentation [6,16], and high-level language tasks such as reading comprehension, question answering, and natural language understanding [8,55]. Nevertheless, recent studies show that these DNNs might not be credible [13]. Some of their "success" can be attributed to exploiting superficial patterns (or shortcuts) in the data rather than capturing the underlying task. This non-credibility issue has been observed in a variety of DNN systems. The most representative example is perh