An improved model training method for residual convolutional neural networks in deep learning



Xuelei Li1 · Rengang Li2,3 · Yaqian Zhao2,3 · Jian Zhao2,3

Received: 8 April 2020 / Revised: 4 September 2020 / Accepted: 6 October 2020 / © Springer Science+Business Media, LLC, part of Springer Nature 2020

Abstract Residual convolutional neural networks (R-CNNs) have become a promising method for image recognition in deep learning applications. Application accuracy, a key indicator, is closely related to the filter weights in trained R-CNN models. To make filters work at full capacity, we observe that lower relevancy between filters in the same layer promotes higher accuracy in R-CNN applications. Based on this observation, we propose an improved R-CNN model training method that achieves higher accuracy and better generalization ability. The main focus of this paper is to control the update of filter weights during model training. The key mechanism computes the relevancy between filters in the same layer, quantified by a correlation coefficient, e.g., the Pearson correlation coefficient (PCC). The mechanism uses the updated filter weights with a larger probability when their correlation coefficient is lower, and with a smaller probability when it is higher. To validate our proposal, we conduct an experiment using PCC on residual networks. The experiment demonstrates that the improved model training method is a promising means of achieving better generalization ability and higher recognition accuracy (0.52%-1.83%) for residual networks.

Keywords Image classification · Residual convolutional neural network · Deep learning · Artificial intelligence
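As a rough illustration of the mechanism summarized in the abstract, the following Python sketch computes the PCC between flattened filters of one layer and keeps an updated set of filter weights with a probability that grows as the correlation falls. It assumes only what the abstract states. The aggregation by mean absolute PCC, the acceptance rule `1 - relevancy`, and all function names are hypothetical choices made for illustration, not the authors' exact procedure.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0.0 or var_y == 0.0:
        return 0.0  # assumption: treat a constant filter as uncorrelated
    return cov / math.sqrt(var_x * var_y)


def mean_abs_pcc(filters):
    """Mean |PCC| over all pairs of flattened filters in one layer."""
    n = len(filters)
    pairs = [(i, j) for i in range(n) for j in range(i + 1, n)]
    if not pairs:
        return 0.0
    total = sum(abs(pearson(filters[i], filters[j])) for i, j in pairs)
    return total / len(pairs)


def accept_probability(relevancy):
    """Hypothetical rule: lower relevancy gives a higher acceptance chance."""
    return max(0.0, min(1.0, 1.0 - relevancy))


def choose_weights(old_filters, new_filters, rng):
    """Keep the updated filters with probability accept_probability(...)."""
    p = accept_probability(mean_abs_pcc(new_filters))
    return new_filters if rng.random() < p else old_filters


if __name__ == "__main__":
    rng = random.Random(0)
    old = [[0.1, 0.2, 0.3], [0.3, 0.1, 0.2]]
    new = [[1.0, 0.0, -1.0], [1.0, -2.0, 1.0]]  # uncorrelated filters
    print(mean_abs_pcc(new))
    print(choose_weights(old, new, rng) is new)
```

In a real training loop the filters would be the flattened convolution kernels of one layer, and the choice between old and updated weights would be made after each update step. A per-filter rather than per-layer decision is equally compatible with the abstract's description.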

Xuelei Li
[email protected]

1 Inspur (Beijing) Electronic Information Industry Co., Ltd, Beijing, 100876, China
2 Inspur Electronic Information Industry Co., Ltd, Jinan, 250101, China
3 State Key Laboratory of High-end Server & Storage Technology, Jinan, 250101, China

Multimedia Tools and Applications

1 Introduction

1.1 Background and problem

With the rapid development of artificial intelligence (AI), neural networks play an important role in deep learning. As a promising method among deep learning algorithms, the convolutional neural network (CNN) has been widely adopted in various computer vision and pattern recognition applications, including driver assistance, character recognition and image classification. In image classification in particular, the CNN has become the prevailing algorithm for recognition applications. Application accuracy, a key indicator, is closely related to the filter weights in trained CNN models. Many effective model training methods exist for improving image recognition accuracy. For example, many studies focus on improving the accuracy of image classification algorithms by adopting larger-scale networks. However, mobile applications need lightweight networks, e.g., MobileNets [8, 13], whose scale is limited by the resources available on mobile devices. Typically,