Effective node selection technique towards sparse learning
- PDF / 1,340,530 Bytes
- 13 Pages / 595.276 x 790.866 pts Page_size
- 78 Downloads / 273 Views
Effective node selection technique towards sparse learning Bunyodbek Ibrokhimov 1 & Cheonghwan Hur 1 & Sanggil Kang 1
# Springer Science+Business Media, LLC, part of Springer Nature 2020
Abstract Neural networks are getting wider and deeper to achieve state-of-the-art results in various machine learning domains. Such networks result in complex structures, high model size, and computational costs. Moreover, these networks are failing to adapt to new data due to their isolation in the specific domain-target space. To tackle these issues, we propose a sparse learning method to train the existing network on new classes by selecting non-crucial parameters from the network. Sparse learning also manages to keep the performance of existing classes with no additional network structure and memory costs by employing an effective node selection technique, which analyzes and selects unimportant parameters by using information theory in the neuron distribution of the fully connected layers. Our method could learn up to 40% novel classes without notable loss in the accuracy of existing classes. Through experiments, we show how a sparse learning method competes with state-of-the-art methods in terms of accuracy and even surpasses the performance of related algorithms in terms of efficiency in memory, processing speed, and overall training time. Importantly, our method can be implemented in both small and large applications, and we justify this by using well-known networks such as LeNet, AlexNet, and VGG-16. Keywords Optimization . Sparse learning . Node selection . Neuron activation . Information theory . Convolutional neural network
1 Introduction Neural networks have been widely employed in many applications, making tremendous progress in different machine learning domains. The advancement in Convolutional Neural Networks (CNNs) made it possible to achieve impressive performances on difficult visual tasks in computer vision such as object detection [1–3], image captioning [4], semantic segmentation [5–7], surface normal estimation [8–10], image recognition and classification [11–13]. CNNs have accomplished state-of-the-art performances, matching or even excelling human performance in some domains due to the development of deep networks like AlexNet [15], VGGNet [16], GoogLeNet [17], and ResNet [11]. Despite these successes, machine learning algorithms are traditionally designed to work in isolation. In other words, these algorithms are trained in a particular domain or even on a specific dataset. Models have to be rebuilt or retrained from scratch once the feature-space distribution * Sanggil Kang [email protected] 1
Department of Computer Engineering, Inha University, Inha-ro 100, Incheon, Nam-gu 22212, South Korea
or a given dataset changes. To solve the problem, transfer learning is proposed, which is an efficient learning method to solve new tasks by using the knowledge of the pretrained models. In transfer learning, one can transfer features, weights, or instances from previously trained models to tackle problems such
Data Loading...