ORIGINAL ARTICLE
The Mode-Fisher pooling for time complexity optimization in deep convolutional neural networks

Dou El Kefel Mansouri 1,2 · Bachir Kaddar 1 · Seif-Eddine Benkabou 2 · Khalid Benabdeslem 2

Received: 21 February 2020 / Accepted: 29 September 2020
© Springer-Verlag London Ltd., part of Springer Nature 2020
Abstract
In this paper, we aim to improve the performance, time complexity, and energy efficiency of deep convolutional neural networks (CNNs) by combining hardware and specialization techniques. Since the pooling step contributes significantly to CNN performance, we propose the Mode-Fisher pooling method. This form of pooling can offer very promising results in terms of feature extraction performance. The proposed method significantly reduces data movement in the CNN and saves up to 10% of total energy, without any performance penalty.

Keywords Convolutional neural networks · CNNs · Mode-Fisher pooling · Energy
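The Mode-Fisher operator itself is defined later in the article. As a rough intuition for the "mode" half of the name only, the following minimal NumPy sketch replaces the max or mean of each pooling window with the window's statistical mode; this is our illustrative assumption, not the authors' exact formulation, and the function name mode_pool2d and the toy feature map are hypothetical.

import numpy as np

def mode_pool2d(x, k=2, s=2):
    """Pool a 2-D feature map by taking the statistical mode of each window.

    Hypothetical sketch, not the paper's Mode-Fisher operator.
    x: 2-D array (H, W); k: window size; s: stride.
    """
    h_out = (x.shape[0] - k) // s + 1
    w_out = (x.shape[1] - k) // s + 1
    out = np.empty((h_out, w_out), dtype=x.dtype)
    for i in range(h_out):
        for j in range(w_out):
            window = x[i * s:i * s + k, j * s:j * s + k]
            # Most frequent activation in the window; ties resolve to the
            # smallest value because np.unique returns sorted values.
            vals, counts = np.unique(window, return_counts=True)
            out[i, j] = vals[np.argmax(counts)]
    return out

# Toy 4x4 feature map pooled with a 2x2 window and stride 2
fmap = np.array([[1, 1, 2, 3],
                 [1, 5, 3, 3],
                 [0, 0, 7, 8],
                 [0, 4, 8, 8]])
print(mode_pool2d(fmap))  # -> [[1 3]
                          #     [0 8]]

On real-valued activations, a mode is only meaningful after quantizing or binning the window values; max and average pooling remain the usual baselines against which any such replacement is judged.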
Correspondence: Dou El Kefel Mansouri, [email protected]
Bachir Kaddar: [email protected]
Seif-Eddine Benkabou: [email protected]; [email protected]
Khalid Benabdeslem: [email protected]

1 University Ibn Khaldoun, BP P 78 Zaâroura, 14000 Tiaret, Algeria
2 LIAS/ISAE-ENSMA, University of Poitiers, 1 Avenue Clément Ader, Futuroscope Cedex, 86960 Lyon, France

1 Introduction

Deep learning with convolutional neural networks (CNNs) [43] has become the state-of-the-art solution in a variety of areas, such as computer vision and natural language processing [13, 39]. Nevertheless, it shows limits with regard to time complexity and energy efficiency, essentially because of the huge amount of data and computational operations involved. Previous work has proposed solutions that reduce data size and computational operations, but always at the cost of performance [31, 59]. Indeed, no solid technique has been able to estimate where a CNN's energy is actually consumed. In this sense, reducing data size or computational operations in an unsupervised manner does not necessarily reduce energy consumption. For instance, the convolution-layer weights in AlexNet [17, 39] can be reduced by 95%, yet the layer still consumes 72.6% of the total energy. GoogleNet can be pruned to fewer multiply-and-accumulate (MAC) operations but consumes more energy [31]. According to Hinton and Salakhutdinov [31], even when the number of target classes is reduced, the energy savings remain limited. Chen [8] estimates that moving data can consume a great deal of energy. Under this hypothesis, eliminating redundant movement of reused data, and thus reducing DRAM accesses, seems to be the key to achieving high energy efficiency. We note that energy efficiency means eliminating wasted energy. As for the power consumption P, it is a real-time metric of the electrical energy being consumed by a device [32]. For instance, the power consumption of a processor is given