Transferring and Compressing Convolutional Neural Networks for Face Representations



1 Centre for Mathematical Sciences, Lund University, Lund, Sweden
[email protected], [email protected]
2 Axis Communications, Lund, Sweden
{jiandan.chen,martin.ljungqvist}@axis.com

Abstract. In this work we have investigated face verification based on deep representations from Convolutional Neural Networks (CNNs) to find an accurate and compact face descriptor trained only on a restricted amount of face image data. Transfer learning by fine-tuning CNNs pre-trained on large-scale object recognition has been shown to be a suitable approach to counter a limited amount of target domain data. Using model compression we reduced the model complexity without significant loss in accuracy and made the feature extraction more feasible for real-time use and deployment on embedded systems and mobile devices. The compression resulted in a 9-fold reduction in the number of parameters and a 5-fold speed-up in the average feature extraction time running on a desktop CPU. With continued training of the compressed model using a Siamese Network setup, it outperformed the larger model.

1 Introduction

In visual recognition it is rapidly becoming standard practice to use deep representations composed of layer activations extracted from Convolutional Neural Networks (CNNs) as object descriptors, see [1,17]. CNNs are frequent top performers on complex image analysis tasks. However, one of their drawbacks is that they require vast amounts of training data in order to perform well. The CNNs used for this purpose are therefore often pre-trained on huge labeled datasets for generic object recognition containing a large set of object categories; from here on we refer to such CNNs as generic CNNs. Generic CNNs, such as [13,19], can be regarded as general-purpose feature extractors producing generic object descriptors, descriptors that may also constitute good representations for domains other than the source domain. Even though a generic CNN usually performs well in domains other than those it was trained for, it still lacks specificity. In many cases the object representations can be further improved by adapting the CNN to the target domain, as done in [1], which led to state-of-the-art results on 16 visual recognition benchmarks.

© Springer International Publishing Switzerland 2016. A. Campilho and F. Karray (Eds.): ICIAR 2016, LNCS 9730, pp. 20–29, 2016. DOI: 10.1007/978-3-319-41501-7_3
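The idea of taking hidden-layer activations of a pre-trained CNN as a generic object descriptor can be sketched as follows. This is a hypothetical illustration assuming PyTorch, not the implementation used in this work; the network architecture and descriptor dimension are placeholders, and in practice the weights would be loaded from a model pre-trained on a large generic object recognition dataset rather than initialized randomly.

```python
import torch
import torch.nn as nn

class TinyCNN(nn.Module):
    """Stand-in for a generic CNN; real work would use a large pre-trained net."""

    def __init__(self, descriptor_dim=128):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(), nn.AdaptiveAvgPool2d(4),
        )
        self.fc_hidden = nn.Linear(32 * 4 * 4, descriptor_dim)  # last hidden layer
        self.classifier = nn.Linear(descriptor_dim, 1000)       # generic object classes

    def describe(self, x):
        # The descriptor is the activation of the last hidden layer;
        # the source-domain classifier head is simply dropped.
        h = self.conv(x).flatten(1)
        return torch.relu(self.fc_hidden(h))

model = TinyCNN().eval()
with torch.no_grad():
    batch = torch.randn(2, 3, 64, 64)    # two preprocessed input images
    descriptors = model.describe(batch)  # one 128-D descriptor per image

print(descriptors.shape)  # torch.Size([2, 128])
```

The descriptors produced this way can then be compared with a distance metric or fed to a lightweight classifier for the target task, which is what makes generic CNNs usable as off-the-shelf feature extractors.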


The process of transferring a generic CNN to a new data domain is often called fine-tuning and is a form of transfer learning. Fine-tuning involves training a CNN initialized with weights from the pre-trained generic CNN, using data from the target domain. Recognising subjects in images with arbitrary angle, position, lighting and other variations is a complex task that requires large CNNs with many layers. Evaluating a trained CNN on unseen data requires the entire CNN structure; this is much more time-efficient than training. However, a real-time application