Self-paced hybrid dilated convolutional neural networks


Wenzhen Zhang¹ · Guangquan Lu¹ · Shichao Zhang¹ (corresponding author) · Yonggang Li¹

¹ Guangxi Key Lab of Multi-Source Information Mining and Security, Guangxi Normal University, Guilin, Guangxi, 541004, China

Shichao Zhang: [email protected] · Guangquan Lu: [email protected] · Wenzhen Zhang: [email protected] · Yonggang Li: [email protected]

Received: 2 March 2020 / Revised: 12 August 2020 / Accepted: 11 September 2020
© Springer Science+Business Media, LLC, part of Springer Nature 2020

Abstract

Convolutional neural networks (CNNs) can learn the features of samples in a supervised manner and have achieved outstanding results in many application fields. To improve the performance and generalization of CNNs, we propose a self-paced hybrid dilated convolutional neural network (SPHDCNN), which selects relatively reliable samples according to its current learning ability during training. To avoid the loss of useful feature-map information caused by pooling, we introduce hybrid dilated convolution. In the proposed SPHDCNN, a weight is applied to each sample to reflect its easiness. SPHDCNN trains on easier samples first and then gradually adds more difficult samples according to its current learning ability, progressively improving its performance through this learning mechanism. Experimental results show that SPHDCNN has strong generalization ability and achieves better performance than the baseline methods.

Keywords Convolutional neural networks (CNNs) · Self-paced learning (SPL) · Hybrid dilated convolution (HDC)
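The self-paced mechanism summarized above assigns each sample a weight reflecting its easiness. As a point of reference, the sketch below shows the standard hard-weighting scheme from the self-paced learning literature, where a sample receives weight 1 only if its current loss falls below an "age" threshold λ that grows over training; the threshold values and growth schedule here are illustrative assumptions, not the paper's exact formulation.

```python
import numpy as np

def spl_weights(losses, lam):
    """Hard self-paced weights: v_i = 1 if loss_i < lam, else 0."""
    return (losses < lam).astype(np.float32)

# Toy illustration: as lam grows, harder (higher-loss) samples
# are admitted into training alongside the easy ones.
losses = np.array([0.05, 0.20, 0.80, 1.50])  # per-sample losses
for lam in (0.1, 0.5, 2.0):                  # lam increases with model age
    print(lam, spl_weights(losses, lam))     # more 1s as lam grows
```

Easy samples (low loss) dominate early updates, and the weighted loss gradually approaches the ordinary unweighted loss as λ grows.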

1 Introduction

In recent years, deep learning has attracted wide interest in areas such as computer vision, natural language processing, and speech recognition. It has achieved great success in

speech and image recognition applications, where it delivers better performance than earlier machine learning techniques. Deep learning captures the correspondence between input data and its true labels through a deep neural network (DNN), which consists of multiple hidden layers and non-linear activation functions. Convolutional neural networks (CNNs) [10] are widely used in image classification tasks such as handwriting and object recognition. Generally speaking, a CNN consists mainly of convolution kernels, non-linear activation functions, and pooling operations. A convolution kernel can be viewed as a specific feature detector: to establish a non-linear mapping between input and output, the kernel convolves the input data, and the result is passed through a non-linear activation function. Pooling is essentially downsampling; inserting pooling layers in the network progressively reduces the spatial size of the data and expands the receptive field. The parameters of a CNN are typically adjusted by stochastic gradient descent with the error back-propagation algorithm.
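As a concrete illustration of these building blocks, the sketch below wires convolution kernels, a non-linear activation, pooling, and an SGD/back-propagation update into a toy classifier, followed by a dilated convolution of the kind HDC builds on. It is written in PyTorch purely for illustration; the layer sizes and hyperparameters are arbitrary assumptions, not the architecture proposed in this paper.

```python
import torch
import torch.nn as nn

# Minimal CNN: convolution kernels act as feature detectors, ReLU supplies
# the non-linear mapping, and max pooling downsamples the feature maps.
model = nn.Sequential(
    nn.Conv2d(1, 16, kernel_size=3, padding=1),  # feature detector
    nn.ReLU(),                                   # non-linear activation
    nn.MaxPool2d(2),                             # downsampling: 28x28 -> 14x14
    nn.Flatten(),
    nn.Linear(16 * 14 * 14, 10),                 # class scores
)

# Parameters are adjusted by stochastic gradient descent with back-propagation.
opt = torch.optim.SGD(model.parameters(), lr=0.01)
x = torch.randn(8, 1, 28, 28)                    # dummy batch of images
y = torch.randint(0, 10, (8,))                   # dummy labels
loss = nn.CrossEntropyLoss()(model(x), y)
loss.backward()                                  # error back-propagation
opt.step()                                       # gradient descent update

# A dilated convolution (the basis of HDC) enlarges the receptive field
# without pooling, so no feature-map information is discarded:
dilated = nn.Conv2d(1, 16, kernel_size=3, padding=2, dilation=2)
print(dilated(x).shape)  # torch.Size([8, 16, 28, 28]) -- size preserved
```

Note the trade-off the last two lines illustrate: pooling expands the receptive field by discarding spatial resolution, whereas dilation expands it while keeping the feature map the same size, which motivates the hybrid dilated convolution used in SPHDCNN.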