
ORIGINAL ARTICLE

MS-NET: modular selective network
Round robin based modular neural network architecture with limited redundancy

Intisar Md Chowdhury¹ · Kai Su¹ · Qiangfu Zhao¹

¹ System Intelligence Laboratory, The University of Aizu, Aizu-Wakamatsu 965-8580, Japan

Received: 29 March 2020 / Accepted: 17 September 2020
© The Author(s) 2020

Abstract

We propose a modular Deep Neural Network (DNN) architecture for multi-class classification. The architecture consists of two parts: a router network and a set of expert networks. For a C-class classification problem, there are exactly C experts, and the experts and the router are built on the same simple, identical backbone DNN. Each class is covered by a certain number ρ of expert networks specializing in that class, where ρ is called the redundancy rate in this study. We demonstrate that ρ plays a vital role in the performance of the network. Although each expert is a lightweight, weak learner on its own, together the experts match the performance of more complex DNNs. We train the network in two phases: first, the router is trained on the whole training set; then each expert is trained with a new stochastic objective function that alternates between a small subset of expert data and the whole data set. This alternating training provides an additional form of regularization and keeps each expert from over-fitting its subset data. At test time, the router dynamically selects a fixed number of experts to further evaluate the input datum. The modular structure and low parameter count make the network well suited to distributed and low-resource computing environments. Extensive empirical study and theoretical analysis on CIFAR-10, CIFAR-100 and F-MNIST substantiate the effectiveness and efficiency of the proposed modular network.

Keywords: Modular neural networks · Deep learning · Knowledge distillation · Multi-class classification · Image classification
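To make the description above concrete, the following is a minimal PyTorch-style sketch of the idea: C lightweight experts share one backbone design, classes are assigned to experts round-robin so that every class is covered by ρ experts, each expert trains by stochastically alternating between its own subset and the full data set, and at test time the router's top-k classes decide which experts vote. All names here (SimpleBackbone, round_robin_subsets, msnet_predict, the probability p) and the exact selection rule are illustrative assumptions, not the authors' implementation.

# Minimal sketch of the MS-NET idea (assumptions, not the authors' code).
import torch
import torch.nn as nn
import torch.nn.functional as F

class SimpleBackbone(nn.Module):
    """Lightweight CNN used for both the router and every expert
    (an assumed stand-in for the paper's simple, identical backbone)."""
    def __init__(self, num_classes: int):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(32, 64, 3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),
        )
        self.head = nn.Linear(64, num_classes)

    def forward(self, x):
        return self.head(self.features(x).flatten(1))

def round_robin_subsets(num_classes: int, rho: int):
    """Assign classes to the C experts round-robin so that every class is
    covered by exactly rho experts (one plausible reading of the paper's
    redundancy rate)."""
    return [[(e + i) % num_classes for i in range(rho)]
            for e in range(num_classes)]

def expert_training_step(expert, subset_loader, full_loader, optimizer, p=0.7):
    """One step of the stochastic alternating objective: with probability p
    draw a batch from the expert's own subset, otherwise from the whole
    training set (the regularizing alternation described in the abstract;
    p is an assumed hyper-parameter)."""
    loader = subset_loader if torch.rand(1).item() < p else full_loader
    x, y = next(iter(loader))
    optimizer.zero_grad()
    loss = F.cross_entropy(expert(x), y)
    loss.backward()
    optimizer.step()
    return loss.item()

def msnet_predict(x, router, experts, subsets, k: int):
    """Route a single input to the experts whose subsets intersect the
    router's top-k predicted classes, then average the selected experts'
    logits (this particular selection rule is our assumption)."""
    with torch.no_grad():
        topk = set(router(x).topk(k, dim=1).indices[0].tolist())
        selected = [e for e, classes in enumerate(subsets)
                    if topk & set(classes)]
        logits = torch.stack([experts[e](x) for e in selected])
        return logits.mean(dim=0).argmax(dim=1)

# Usage sketch for a CIFAR-10-sized problem (C = 10, rho = 3, top-k = 2):
# router  = SimpleBackbone(10)
# experts = [SimpleBackbone(10) for _ in range(10)]
# subsets = round_robin_subsets(10, rho=3)
# pred    = msnet_predict(x, router, experts, subsets, k=2)  # x: (1, 3, 32, 32)

Under this selection rule at most k·ρ experts run per input, so ρ directly trades redundancy of the vote against inference cost.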

1 Introduction

Deep Neural Networks (DNNs) have, over the last two decades, shown their superiority in the fields of visual object recognition [26, 62, 64]; image segmentation [5, 9, 63, 76]; speech recognition and translation [3, 29]; natural language processing [11, 68]; reinforcement learning [43, 56, 57]; bioinformatics [63]; education [38, 73]; and so on. Despite their simple layered structures of neurons and connections, they have outperformed other machine learning models [74]. This superiority comes from their ability to learn complex non-linear mappings from input to output and to learn rich, discriminative features automatically, as opposed to relying on hand-engineered low-level features such as Gabor features [41], local binary patterns [2], SIFT [54] and so on. With the passage of time, we can see that not only is the performance improving dramatically, als