Relay Backpropagation for Effective Learning of Deep Convolutional Neural Networks

1 University of Chinese Academy of Sciences, Beijing, China
  [email protected]
2 University of Oxford, Oxford, UK
  [email protected]
3 Key Laboratory of Machine Perception (MOE), School of EECS, Peking University, Beijing, China
  [email protected]
4 Cooperative Medianet Innovation Center, Shanghai Jiao Tong University, Shanghai, China

Abstract. Learning deeper convolutional neural networks has become a tendency in recent years. However, much empirical evidence suggests that performance improvements cannot be attained by simply stacking more layers. In this paper, we consider the issue from an information-theoretic perspective and propose a novel method, Relay Backpropagation, which encourages the propagation of effective information through the network during the training stage. By virtue of this method, we achieved first place in the ILSVRC 2015 Scene Classification Challenge. Extensive experiments on two large-scale challenging datasets demonstrate that the effectiveness of our method is not restricted to a specific dataset or network architecture.

Keywords: Relay Backpropagation · Convolutional neural networks · Large scale image classification

1 Introduction

Convolutional neural networks (CNNs) are capable of inducing rich features from data, and have been successfully applied to a variety of computer vision tasks. Many breakthroughs obtained in recent years benefit from the advances of convolutional neural networks [2,12,13,24], spurring research into high-performing networks. The importance of network depth has been revealed in these successes. For example, compared with AlexNet [13], VGGNet [19] brings substantial gains in accuracy on the 1000-class ImageNet 2012 dataset by virtue of its deeper architecture. Increasing the depth of a network has thus become a promising way to enhance performance. On the downside, such a solution is accompanied by growth in parameter size and model complexity, and therefore poses great challenges for optimisation.
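To make the parameter growth concrete, here is a quick back-of-the-envelope calculation. It is not from the paper: the 512-channel width is an illustrative assumption matching VGGNet's deepest stage, and "VGGNet-22" here simply stands for adding three such layers.

```python
# Back-of-the-envelope parameter count for stacked 3x3 conv layers
# (illustrative; channel width of 512 is an assumption, not from the paper).
def conv_params(in_ch, out_ch, k=3):
    """Weights plus biases of one k x k convolution layer."""
    return (k * k * in_ch + 1) * out_ch

# Three extra 512-channel 3x3 layers (e.g. going from VGGNet-19 to a
# hypothetical VGGNet-22) add roughly 7.1M parameters:
extra = sum(conv_params(512, 512) for _ in range(3))
print(f"{extra / 1e6:.1f}M extra parameters")  # -> 7.1M
```

Even a few extra layers at this width add millions of parameters on top of VGGNet-19's roughly 144 million, illustrating why deeper variants become harder to optimise.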


Table 1. Error rates (%) on the ImageNet 2012 classification and Places2 challenge validation sets. VGGNet-22 and VGGNet-25 are obtained by simply adding 3 and 6 layers to VGGNet-19, respectively.

ImageNet 2012:
  Model       top-1 err.  top-5 err.
  VGGNet-13   28.2        9.6
  VGGNet-16   26.6        8.6
  VGGNet-19   26.9        8.7

Places2 challenge:
  Model       top-1 err.  top-5 err.
  VGGNet-19   48.5        17.1
  VGGNet-22   48.7        17.2
  VGGNet-25   48.9        17.4

The training of deeper networks typically encounters the risk of divergence or slower convergence, and is prone to overfitting. Moreover, much empirical evidence [5,19,20] (e.g., the results reported by [19] on the ImageNet dataset, shown in Table 1 (Left)) shows that improvements in accuracy cannot be gained by trivially adding more layers. This accords with the results of our preliminary experiments on the Places2 challenge dataset [29], where deeper networks even suffer a decline in performance (Table 1 (Right)).
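To preview the remedy this paper proposes: Relay Backpropagation restricts how far each loss's gradient travels through the network, so that every group of layers is trained by the most informative supervision signal. The sketch below is a minimal, hypothetical PyTorch illustration of that truncated-gradient idea; the toy architecture, the segment boundaries, and the use of detach() are our assumptions for exposition, not the paper's exact scheme (which partitions the full network into segments with auxiliary classifiers).

```python
import torch
import torch.nn as nn

class RelayToyNet(nn.Module):
    """Three-segment toy CNN with one auxiliary classifier.

    The auxiliary loss updates segments 1-2; the final loss is cut off at
    the segment-2/3 boundary via detach(), so it updates segment 3 only.
    All names and segment boundaries here are illustrative assumptions.
    """

    def __init__(self, num_classes=10):
        super().__init__()
        self.seg1 = nn.Sequential(nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2))
        self.seg2 = nn.Sequential(nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2))
        self.seg3 = nn.Sequential(nn.Conv2d(32, 64, 3, padding=1), nn.ReLU(), nn.AdaptiveAvgPool2d(1))
        self.aux_head = nn.Linear(32, num_classes)   # auxiliary classifier after segment 2
        self.main_head = nn.Linear(64, num_classes)  # final classifier

    def forward(self, x):
        f1 = self.seg1(x)
        f2 = self.seg2(f1)
        # Auxiliary branch: its gradient flows back through seg2 and seg1.
        aux_logits = self.aux_head(f2.mean(dim=(2, 3)))
        # detach() blocks the final loss's gradient at this boundary,
        # so it only trains seg3 and main_head (the "relay" truncation).
        main_logits = self.main_head(self.seg3(f2.detach()).flatten(1))
        return aux_logits, main_logits

# Usage: sum the two losses; a single backward pass then updates each
# segment only from its assigned loss.
model = RelayToyNet()
x, y = torch.randn(4, 3, 32, 32), torch.randint(0, 10, (4,))
aux_logits, main_logits = model(x)
loss = nn.functional.cross_entropy(main_logits, y) \
     + nn.functional.cross_entropy(aux_logits, y)
loss.backward()
```

The design choice being illustrated is that deeper layers need not receive (possibly noisy) gradients from every loss; each loss supervises only the portion of the network closest to it.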