LayerOut: Freezing Layers in Deep Neural Networks



ORIGINAL RESEARCH

Kelam Goutam¹ · S. Balasubramanian¹ · Darshan Gera¹ · R. Raghunatha Sarma¹

Received: 16 May 2020 / Accepted: 28 August 2020
© Springer Nature Singapore Pte Ltd 2020

¹ Department of Mathematics and Computer Science, Sri Sathya Sai Institute of Higher Learning, Prashantinilayam, India

Abstract
Deep networks involve a huge amount of computation during the training phase and are prone to over-fitting. To ameliorate these issues, several conventional techniques such as DropOut, DropConnect, Guided Dropout, Stochastic Depth, and BlockDrop have been proposed. These techniques regularize a neural network by dropping nodes, connections, layers, or blocks within the network. However, these conventional regularization techniques suffer from the limitation that they are suited either to fully connected networks or to ResNet-based architectures. In this research, we propose LayerOut, a novel regularization technique for training deep neural networks that stochastically freezes the trainable parameters of a layer during an epoch of training. This technique can be applied both to fully connected networks and to all types of convolutional networks such as VGG-16, ResNet, etc. Experimental evaluation on multiple datasets, including MNIST, CIFAR-10, and CIFAR-100, demonstrates that LayerOut generalizes better than the conventional regularization techniques and additionally reduces the computational burden significantly. We have observed up to 70% reduction in computation per epoch and up to 2% improvement in classification accuracy as compared to the baseline networks (VGG-16 and ResNet-110) on the above datasets. Code is publicly available at https://github.com/Goutam-Kelam/LayerOut.

Keywords: LayerOut · DropOut · DropConnect · Guided Dropout · Stochastic Depth · BlockDrop
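The core mechanism described in the abstract, freezing a layer's trainable parameters for the duration of an epoch, can be illustrated with a minimal PyTorch sketch. This is not the authors' released implementation (see the GitHub link above); the toy network, the freezing probability `p`, and the helper `apply_layerout` are illustrative assumptions only.

```python
# Minimal sketch of stochastic layer freezing (assumed details, not the paper's code).
# At the start of each epoch, every hidden layer is independently frozen with
# probability p by disabling gradients for its parameters; frozen layers still
# participate in the forward pass but receive no parameter updates.
import random
import torch
import torch.nn as nn


class MLP(nn.Module):
    def __init__(self):
        super().__init__()
        self.layers = nn.ModuleList([
            nn.Linear(784, 512),
            nn.Linear(512, 512),
            nn.Linear(512, 10),
        ])

    def forward(self, x):
        for i, layer in enumerate(self.layers):
            x = layer(x)
            if i < len(self.layers) - 1:
                x = torch.relu(x)
        return x


def apply_layerout(model, p=0.5):
    """Freeze each hidden layer for the coming epoch with probability p."""
    for layer in model.layers[:-1]:              # keep the output layer trainable
        freeze = random.random() < p
        for param in layer.parameters():
            param.requires_grad = not freeze


model = MLP()
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
for epoch in range(10):
    apply_layerout(model, p=0.5)
    # ... run one epoch of training here; plain SGD skips parameters whose
    # gradients are None, so frozen layers are not updated this epoch
```

Because frozen layers neither store activation gradients nor update their weights, a sketch like this also hints at where the reported per-epoch computation savings come from; the exact freezing schedule used in the paper is described in later sections.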

Introduction

The recent trend in the deep learning community is to use deeper neural networks (DNNs) [8, 27] to solve real-life problems such as image classification [17, 26], language translation [19, 29], object detection [5, 21], speech recognition [2, 6], etc. However, deeper neural networks have been empirically found to display the undesirable characteristic of being prone to over-fitting. Further, the computational load of training a deeper network is by no means trivial. This makes the deployment of deeper models in real-time environments, such as interactive applications on mobile devices and autonomous driving, a challenging task.

In the literature, there are multiple techniques to reduce over-fitting in DNNs, such as data augmentation, which increases the number of training samples; semi-supervised learning [10], which additionally uses a large amount of unsupervised data to train the DNNs; and transfer learning [4, 15], which additionally uses models pre-trained on a large amount of supervised data. The regularization tech