Hybrid nonlinear convolution filters for image recognition



Xiuling Zhang1 · Kailun Wei1 · Xuenan Kang1 · Jinxiang Li1

© Springer Science+Business Media, LLC, part of Springer Nature 2020

Abstract Typical convolutional filters extract features only linearly. Although activation functions and pooling operations introduce nonlinearity into the feature extraction layer, they provide only point-wise nonlinearity. In this paper, a Gaussian convolution for extracting nonlinear features is proposed, and a hybrid nonlinear convolution filter consisting of baseline convolution, Gaussian convolution and other nonlinear convolutions is designed. The hybrid filter efficiently fuses linear and nonlinear features while preserving the advantages of the traditional linear convolution filter in feature extraction. Extensive experiments on the benchmark datasets MNIST, CIFAR10, and CIFAR100 show that the hybrid nonlinear convolutional neural network converges faster and achieves higher image recognition accuracy than the traditional baseline convolutional neural network.

Keywords Gaussian convolution · Point-wise nonlinearity · Nonlinear features · Image recognition
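The distinction between a linear convolution and a point-wise nonlinearity can be illustrated with a small numerical check. The sketch below is not the Gaussian convolution proposed in this paper. It only assumes a plain 1-D filter and a ReLU activation, and the helper names `conv1d` and `relu`, the kernel weights and the input signals are hypothetical values chosen for illustration.

```python
def conv1d(signal, kernel):
    """Valid-mode 1-D cross-correlation, the operation CNN layers call convolution."""
    k = len(kernel)
    return [
        sum(signal[i + j] * kernel[j] for j in range(k))
        for i in range(len(signal) - k + 1)
    ]


def relu(values):
    """Point-wise nonlinearity: each output depends on exactly one input value."""
    return [max(0.0, v) for v in values]


kernel = [1.0, -2.0, 1.0]        # hypothetical filter weights
x = [0.0, 1.0, 3.0, 2.0, 5.0]    # hypothetical input signals
y = [2.0, -1.0, 0.0, 4.0, 1.0]
a, b = 2.0, -3.0

# Superposition test: conv(a*x + b*y) should equal a*conv(x) + b*conv(y).
mixed = [a * xi + b * yi for xi, yi in zip(x, y)]
lhs = conv1d(mixed, kernel)
rhs = [a * cx + b * cy for cx, cy in zip(conv1d(x, kernel), conv1d(y, kernel))]
linear_ok = all(abs(l - r) < 1e-9 for l, r in zip(lhs, rhs))

# The same test after ReLU. The activation is applied to one scalar
# filter response at a time, so it is nonlinear but only point-wise.
relu_lhs = relu(lhs)
relu_rhs = [
    a * cx + b * cy
    for cx, cy in zip(relu(conv1d(x, kernel)), relu(conv1d(y, kernel)))
]
relu_linear = all(abs(l - r) < 1e-9 for l, r in zip(relu_lhs, relu_rhs))

print(linear_ok, relu_linear)
```

Under these assumptions the filter passes the superposition test and the ReLU output fails it, which is the sense in which the nonlinearity enters only after, and separately for, each linear filter response.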

Xiuling Zhang
[email protected]

1 Key Laboratory of Industrial Computer Control Engineering of Hebei Province, Yanshan University, Qinhuangdao 066004, China

1 Introduction In the field of computer vision and pattern recognition [1], feature extraction is an important data processing method, and the quality of the extracted features directly affects the performance of subsequent work. The convolutional neural network (CNN) is currently the most widely used feature extraction method [2]. Because CNNs offer powerful representation learning and translation-invariant classification of input information, they have achieved great success in computer vision and pattern recognition, including object recognition [3], object detection [4], image enhancement [5], object tracking [6], semantic segmentation [7], etc. Since AlexNet [8] won the 2012 ImageNet competition by a wide margin, research on deep learning has flourished. In recent years, various effective network structures have been proposed, and these models perform well in both computer vision and pattern recognition.

Although convolution has an overwhelming advantage in extracting the characteristics of image data, it still has limitations [9]. In recent years, neuroscience research has shown that most cells in the striate cortex (the part of the visual cortex involved in processing visual information) can be divided into simple, complex and ultra-complex cells with specific response characteristics [10]. However, a typical convolutional layer cannot express the complex response properties found in the striate cortex, because it is a linear system that performs an affine transformation [11]. Although activation functions and pooling operations introduce nonlinearity into the feature extraction layer, they provide only point-wise nonlinearity [12]. Exis