Relay Backpropagation for Effective Learning of Deep Convolutional Neural Networks

1 University of Chinese Academy of Sciences, Beijing, China
  [email protected]
2 University of Oxford, Oxford, UK
  [email protected]
3 Key Laboratory of Machine Perception (MOE), School of EECS, Peking University, Beijing, China
  [email protected]
4 Cooperative Medianet Innovation Center, Shanghai Jiao Tong University, Shanghai, China

Abstract. Learning deeper convolutional neural networks has become a tendency in recent years. However, much empirical evidence suggests that performance improvements cannot be attained by simply stacking more layers. In this paper, we consider the issue from an information-theoretic perspective and propose a novel method, Relay Backpropagation, which encourages the propagation of effective information through the network during the training stage. By virtue of this method, we achieved first place in the ILSVRC 2015 Scene Classification Challenge. Extensive experiments on two large-scale challenging datasets demonstrate that the effectiveness of our method is not restricted to a specific dataset or network architecture.

Keywords: Relay Backpropagation · Convolutional neural networks · Large scale image classification

1 Introduction

Convolutional neural networks (CNNs) are capable of inducing rich features from data, and have been successfully applied to a variety of computer vision tasks. Many breakthroughs obtained in recent years benefit from the advances of convolutional neural networks [2,12,13,24], spurring research into high-performing networks. The importance of network depth has been revealed in these successes. For example, compared with AlexNet [13], VGGNet [19] brings substantial gains in accuracy on the 1000-class ImageNet 2012 dataset by virtue of its deeper architecture. Increasing the depth of a network has thus become a promising way to enhance performance. On the downside, such a solution is accompanied by growth in parameter size and model complexity, and therefore poses great challenges for optimisation.
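To make the parameter growth concrete, here is a quick back-of-the-envelope calculation. It is not from the paper: the 512-channel width is an illustrative assumption matching VGGNet's deepest stage, and "VGGNet-22" here simply stands for adding three such layers.

```python
# Back-of-the-envelope parameter count for stacked 3x3 conv layers
# (illustrative; channel width of 512 is an assumption, not from the paper).
def conv_params(in_ch, out_ch, k=3):
    """Weights plus biases of one k x k convolution layer."""
    return (k * k * in_ch + 1) * out_ch

# Three extra 512-channel 3x3 layers (e.g. going from VGGNet-19 to a
# hypothetical VGGNet-22) add roughly 7.1M parameters:
extra = sum(conv_params(512, 512) for _ in range(3))
print(f"{extra / 1e6:.1f}M extra parameters")  # -> 7.1M
```

Even a few extra layers at this width add millions of parameters on top of VGGNet-19's roughly 144 million, illustrating why deeper variants become harder to optimise.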


Table 1. Error rates (%) on the ImageNet 2012 classification and Places2 challenge validation sets. VGGNet-22 and VGGNet-25 are obtained by simply adding 3 and 6 layers to VGGNet-19, respectively.

ImageNet 2012:
  Model       top-1 err.  top-5 err.
  VGGNet-13   28.2        9.6
  VGGNet-16   26.6        8.6
  VGGNet-19   26.9        8.7

Places2 challenge:
  Model       top-1 err.  top-5 err.
  VGGNet-19   48.5        17.1
  VGGNet-22   48.7        17.2
  VGGNet-25   48.9        17.4

The training of deeper networks typically encounters the risk of divergence or slower convergence, and is prone to overfitting. Moreover, much empirical evidence [5,19,20] (e.g., the results reported by [19] on the ImageNet dataset, shown in Table 1 (Left)) shows that improvements in accuracy cannot be gained by trivially adding more layers. This accords with the results of our preliminary experiments on the Places2 challenge dataset [29], where deeper networks even suffer a decline in performance (Table 1 (Right)).
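To preview the remedy this paper proposes: Relay Backpropagation restricts how far each loss's gradient travels through the network, so that every group of layers is trained by the most informative supervision signal. The sketch below is a minimal, hypothetical PyTorch illustration of that truncated-gradient idea; the toy architecture, the segment boundaries, and the use of detach() are our assumptions for exposition, not the paper's exact scheme (which partitions the full network into segments with auxiliary classifiers).

```python
import torch
import torch.nn as nn

class RelayToyNet(nn.Module):
    """Three-segment toy CNN with one auxiliary classifier.

    The auxiliary loss updates segments 1-2; the final loss is cut off at
    the segment-2/3 boundary via detach(), so it updates segment 3 only.
    All names and segment boundaries here are illustrative assumptions.
    """

    def __init__(self, num_classes=10):
        super().__init__()
        self.seg1 = nn.Sequential(nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2))
        self.seg2 = nn.Sequential(nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2))
        self.seg3 = nn.Sequential(nn.Conv2d(32, 64, 3, padding=1), nn.ReLU(), nn.AdaptiveAvgPool2d(1))
        self.aux_head = nn.Linear(32, num_classes)   # auxiliary classifier after segment 2
        self.main_head = nn.Linear(64, num_classes)  # final classifier

    def forward(self, x):
        f1 = self.seg1(x)
        f2 = self.seg2(f1)
        # Auxiliary branch: its gradient flows back through seg2 and seg1.
        aux_logits = self.aux_head(f2.mean(dim=(2, 3)))
        # detach() blocks the final loss's gradient at this boundary,
        # so it only trains seg3 and main_head (the "relay" truncation).
        main_logits = self.main_head(self.seg3(f2.detach()).flatten(1))
        return aux_logits, main_logits

# Usage: sum the two losses; a single backward pass then updates each
# segment only from its assigned loss.
model = RelayToyNet()
x, y = torch.randn(4, 3, 32, 32), torch.randint(0, 10, (4,))
aux_logits, main_logits = model(x)
loss = nn.functional.cross_entropy(main_logits, y) \
     + nn.functional.cross_entropy(aux_logits, y)
loss.backward()
```

The design choice being illustrated is that deeper layers need not receive (possibly noisy) gradients from every loss; each loss supervises only the portion of the network closest to it.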