Weight asynchronous update: Improving the diversity of filters in a deep convolutional network
Dejun Zhang, Linchao He, Mengting Luo, Zhanya Xu, and Fazhi He
© The Author(s) 2020.
Abstract  Deep convolutional networks have achieved remarkable results on various visual tasks due to their strong ability to learn a variety of features. A well-trained deep convolutional network can be compressed to 20%–40% of its original size by removing filters that make little contribution, as many overlapping features are generated by redundant filters. Model compression can reduce the number of unnecessary filters, but it does not take advantage of redundant filters because the training phase is unaffected. Modern networks with residual and dense connections and inception blocks are considered able to mitigate the overlap among convolutional filters, but they do not necessarily overcome the issue. To do so, we propose a new training strategy, weight asynchronous update, which significantly increases the diversity of filters and enhances the representation ability of the network. The proposed method can be applied to a wide range of convolutional networks without changing the network topology. Our experiments show that updating a stochastic subset of filters in different iterations significantly reduces filter overlap in convolutional networks. Extensive experiments show that our method yields noteworthy improvements in neural network performance.
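To make the core idea concrete, the following is a minimal sketch (not the authors' implementation) of how a stochastic subset of convolutional filters could be updated in each iteration of a PyTorch-style training loop. The function name mask_filter_grads and the update_ratio parameter are illustrative assumptions, not names from the paper.

# A minimal sketch of weight-asynchronous updating: in each training iteration,
# only a random subset of convolutional filters receives a gradient update,
# while the remaining filters keep their previous weights.
import torch
import torch.nn as nn

def mask_filter_grads(model: nn.Module, update_ratio: float = 0.5) -> None:
    """Zero the gradients of a random subset of filters in every Conv2d layer,
    so that only roughly `update_ratio` of the filters are updated this step."""
    for module in model.modules():
        if isinstance(module, nn.Conv2d) and module.weight.grad is not None:
            out_channels = module.weight.shape[0]
            # Boolean mask over output channels: True = update this filter now.
            keep = torch.rand(out_channels, device=module.weight.device) < update_ratio
            module.weight.grad[~keep] = 0.0
            if module.bias is not None and module.bias.grad is not None:
                module.bias.grad[~keep] = 0.0

# Usage inside an ordinary training loop (model, loader, criterion, optimizer assumed):
# for images, labels in loader:
#     optimizer.zero_grad()
#     loss = criterion(model(images), labels)
#     loss.backward()
#     mask_filter_grads(model, update_ratio=0.5)  # asynchronous filter updates
#     optimizer.step()

Because the gradient mask is resampled every iteration, different filters are updated at different times, which is one simple way to realize the "stochastic subset of filters" described above without changing the network topology.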
Introduction
In the past few years, deep learning methods based on convolutional neural networks (CNNs) have achieved significant success in machine vision [1, 2], shape representation [3–5], speech recognition [6, 7], natural language processing [8–10], etc. In particular, many advanced deep convolutional networks have been proposed to handle visual tasks. For example, the success of deep residual nets has inspired researchers to explore deeper, wider, and more complex frameworks [11, 12].

Deep convolutional networks possess strong learning capability owing to their rich sets of parameters. However, the number of parameters can at times be excessive, which leads to overlapping and redundant features; it also causes overfitting to the training set and a lack of generalization to new data. Several modern networks with hundreds of layers (e.g., ResNet [13], DenseNet [11], and Inception [14]) employ an architectural approach to alleviate these problems. One key idea is that residual connections in early layers and feature fusion can be regarded as adding noise in the feature space, which regularizes the network and hence reduces the overlap of learned deep features.

A trained network may be further compressed by pruning, quantization, or binarization, which typically exploits the redundancy in the weights of the trained network. In general, the purpose of model compression, rather than optimizing the capacity of the network during training, is to minimize memory requirements and to accelerate inference without degrading performance. Exploring the best perf