Classification score approach for detecting adversarial example in deep neural network
Hyun Kwon 1 · Yongchul Kim 1 · Hyunsoo Yoon 2 · Daeseon Choi 3

Received: 17 April 2019 / Revised: 8 January 2020 / Accepted: 4 June 2020 / © The Author(s) 2020
Abstract
Deep neural networks (DNNs) provide superior performance on machine learning tasks such as image recognition, speech recognition, pattern analysis, and intrusion detection. However, an adversarial example, created by adding a small amount of noise to an original sample, can cause a DNN to misclassify. This is a serious threat to DNNs because the added noise is imperceptible to the human eye. For example, if an attacker modifies a right-turn sign so that it is misread as a left-turn sign, an autonomous vehicle using the DNN will incorrectly classify the modified sign as pointing to the left, whereas a person will correctly classify it as pointing to the right. Studies are under way to defend against such adversarial examples. However, existing defenses require an additional process, such as changing the classifier or modifying the input data. In this paper, we propose a new method for detecting adversarial examples that does not require any such additional process. The proposed scheme detects adversarial examples by using a pattern feature of their classification scores. We used MNIST and CIFAR10 as experimental datasets and TensorFlow as the machine learning library. The experimental results show that the proposed method detects adversarial examples with success rates of 99.05% and 99.9% for the untargeted and targeted cases on MNIST, respectively, and 94.7% and 95.8% for the untargeted and targeted cases on CIFAR10, respectively.

Keywords Deep neural network · Evasion attack · Adversarial example · Machine learning · Detection method · Classification score
Daeseon Choi
[email protected]

Hyun Kwon
[email protected]

1 Department of Electrical Engineering, Korea Military Academy, 574 Hwarang-ro, Nowon-gu, Seoul 01819, Republic of Korea
2 School of Computing, Korea Advanced Institute of Science and Technology (KAIST), 291 Daehak-ro, Yuseong-gu, Daejeon 34141, Republic of Korea
3 Department of Software, Soongsil University, 369 Sangdo-ro, Dongjak-gu, Seoul, Republic of Korea
Multimedia Tools and Applications
1 Introduction

Deep neural networks (DNNs) [26] provide excellent performance on machine learning tasks such as image recognition [28], speech recognition [10, 11], pattern analysis [4], and intrusion detection [24]. However, DNNs are vulnerable to adversarial examples [29, 33], in which a small amount of noise has been added to an original sample. An adversarial example can cause misclassification by the DNN, yet humans cannot detect a difference between the adversarial example and the original sample. For instance, if an attacker generates a modified left-turn sign so that it will be incorrectly categorized by a DNN, an autonomous vehicle using the DNN will incorrectly classify the modified sign as pointing to the right, but a person will correctly classify it as pointing to the left.
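To make the notion concrete, the sketch below shows one widely used way of crafting such an adversarial example, the fast gradient sign method (FGSM); it is an illustrative assumption here, not necessarily the attack evaluated in this paper, and the TensorFlow 2 Keras model and epsilon value are likewise assumed. The idea is to add a small, sign-scaled gradient perturbation to the original image so that the classifier's prediction changes while the image still looks unchanged to a person.

```python
import tensorflow as tf

def fgsm_adversarial_example(model, x, y_true, eps=0.1):
    """Craft x_adv = x + eps * sign(grad_x L(model(x), y_true)).

    model  : a tf.keras classifier returning softmax probabilities (assumed)
    x      : input batch in [0, 1], shape (N, H, W, C)
    y_true : integer class labels, shape (N,)
    eps    : perturbation size (illustrative value; kept small so the noise
             is hard for the human eye to notice)
    """
    x = tf.convert_to_tensor(x, dtype=tf.float32)
    with tf.GradientTape() as tape:
        tape.watch(x)
        loss = tf.keras.losses.sparse_categorical_crossentropy(y_true, model(x))
    grad = tape.gradient(loss, x)             # gradient of the loss w.r.t. the input
    x_adv = x + eps * tf.sign(grad)           # add a small amount of noise to the original sample
    return tf.clip_by_value(x_adv, 0.0, 1.0)  # keep pixel values in the valid range
```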