A comprehensive survey on model compression and acceleration

Tejalal Choudhary1 · Vipul Mishra1 · Anurag Goswami1 · Jagannathan Sarangapani2
© Springer Nature B.V. 2020
Abstract

In recent years, machine learning (ML) and deep learning (DL) have shown remarkable improvements in computer vision, natural language processing, stock prediction, forecasting, and audio processing, to name a few. The trained DL models for these complex tasks are large, which makes them difficult to deploy on resource-constrained devices. For instance, the size of the pre-trained VGG16 model trained on the ImageNet dataset is more than 500 MB. Resource-constrained devices such as mobile phones and internet-of-things devices have limited memory and computation power. For real-time applications, the trained models should be deployable on such resource-constrained devices. Popular convolutional neural network models have millions of parameters, which leads to large trained models. Hence, it becomes essential to compress and accelerate these models before deploying them on resource-constrained devices, with minimal compromise in model accuracy. Retaining the same accuracy after compressing a model is a challenging task. To address this challenge, over the last few years many researchers have proposed different techniques for model compression and acceleration. In this paper, we present a survey of various techniques suggested for compressing and accelerating ML and DL models. We also discuss the challenges of the existing techniques and provide future research directions in the field.

Keywords Model compression and acceleration · Machine learning · Deep learning · CNN · RNN · Resource-constrained devices · Efficient neural networks
* Vipul Mishra [email protected]
Tejalal Choudhary [email protected]
Anurag Goswami [email protected]
Jagannathan Sarangapani [email protected]
1 Bennett University, Greater Noida, India
2 Missouri University of Science and Technology, Rolla, MO 65409, USA
1 Introduction

Deep learning is based on artificial neural networks (ANNs) and is part of the broader family of machine learning. A neural network with more than one hidden layer is known as a deep neural network (DNN). Researchers have seen significant improvements in the accuracy of DNNs over the last few years. In addition, AlexNet (Krizhevsky et al. 2012) winning the ImageNet (Deng et al. 2009) challenge in 2012 drew a lot of attention from the artificial intelligence (AI) community. ML and DL have been applied successfully to solve various real-world problems related to text, image, audio, and video, such as image captioning (Xu et al. 2015), language translation (Sutskever et al. 2014), object detection (Girshick et al. 2014; Ren et al. 2015), speech recognition (Graves and Schmidhuber 2005; Graves et al. 2013), image generation (Goodfellow et al. 2014), and classification.
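The storage figure quoted in the abstract can be checked directly from the parameter count. The following is a minimal sketch, assuming PyTorch with torchvision (version 0.13 or later, for the weights argument) is installed; it counts VGG16's learnable parameters and converts the count to float32 storage.

    # Minimal sketch: verify that VGG16's parameter count times 4 bytes
    # per float32 weight already exceeds 500 MB before any compression.
    # Assumes torchvision >= 0.13 (weights=None builds the architecture
    # without downloading pretrained weights).
    import torchvision.models as models

    model = models.vgg16(weights=None)

    # Total number of learnable parameters (weights and biases).
    num_params = sum(p.numel() for p in model.parameters())

    # Each parameter is stored as a 32-bit float, i.e. 4 bytes.
    size_mb = num_params * 4 / (1024 ** 2)

    print(f"VGG16 parameters: {num_params:,}")        # about 138 million
    print(f"Approx. float32 size: {size_mb:.1f} MB")  # about 528 MB

The same arithmetic shows why reducing the bits per parameter, for example by quantizing 32-bit weights to 8-bit integers, cuts storage by roughly a factor of four, which is one of the compression directions this survey covers.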