Bayesian neural networks at scale: a performance analysis and pruning study



Himanshu Sharma · Elise Jennings

Argonne Leadership Computing Facility, Argonne National Laboratory, Lemont, IL, USA

© This is a U.S. government work and not under copyright protection in the U.S.; foreign copyright protection may apply 2020

Abstract
Bayesian neural networks (BNNs) are a promising method for obtaining statistical uncertainties on neural network predictions, but they carry a higher computational overhead which can limit their practical use. This work explores the use of high-performance computing with distributed training to address the challenges of training BNNs at scale. We present a performance and scalability comparison of training the VGG-16 and ResNet-18 models on a Cray XC40 cluster. We demonstrate that network pruning can speed up inference without accuracy loss and provide an open-source software package, BPrune, to automate this pruning. For certain models we find that pruning up to 80% of the network results in only a 7.0% loss in accuracy. With the development of new hardware accelerators for deep learning, BNNs are of considerable interest for benchmarking performance. This analysis of training a BNN at scale outlines the limitations and benefits compared to a conventional neural network.

Keywords: Bayesian neural networks (BNN) · Distributed training · Model uncertainty · Pruning BNNs

1 Introduction

One important challenge for machine and deep learning (DL) practitioners is to develop a robust and accurate understanding of model uncertainty. Current state-of-the-art deep learning networks are able to learn representations in complex high-dimensional data and make context-informed predictions. However, these predictions are often taken at face value along with the reported accuracy metric, which may be erroneous.








Further, for scientific applications of machine learning, such as in physics, biology and manufacturing, including accurate model uncertainties is crucial. Conventional deep neural networks (DNNs) are deterministic models: they do not provide uncertainty quantification (UQ), model confidence or a probabilistic framework for model comparison. Typically, a probabilistic model is used to compute these quantities of interest. In a deep learning context, DNNs can be integrated with probabilistic models such as Gaussian processes, which induce a probability distribution over functions. A Gaussian process can be recovered from such networks in the limit of an infinite number of weights, each associated with a probability distribution (see [1, 2]). In the finite setting, a Bayesian neural network (BNN) is a DNN with probability distributions instead of point estimates for each weight. Foundational works on this topic, such as MacKay [3] and Neal [1], have led to BNNs gaining popularity among DL practitioners. In theory, these networks can overcome many limitations
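As an illustrative sketch of this distinction (not drawn from the paper), the toy layer below stores a Gaussian mean and standard deviation for every weight and samples a new weight realisation on each forward pass; repeating the pass yields a predictive mean and spread, i.e. an uncertainty estimate. All names, shapes and values are assumptions made for illustration.

```python
# Minimal sketch of a dense layer whose weights are Gaussian distributions
# (mean, std) rather than point estimates. Not the paper's implementation.
import numpy as np

class BayesianDense:
    def __init__(self, n_in, n_out, rng):
        self.w_mean = rng.normal(scale=0.1, size=(n_in, n_out))  # posterior means
        self.w_std = np.full((n_in, n_out), 0.05)                 # posterior std devs
        self.rng = rng

    def forward(self, x):
        # Sample one weight realisation from the per-weight Gaussians.
        w = self.rng.normal(self.w_mean, self.w_std)
        return x @ w

rng = np.random.default_rng(0)
layer = BayesianDense(n_in=8, n_out=1, rng=rng)
x = rng.normal(size=(1, 8))

# Monte Carlo predictions: the spread across samples reflects model uncertainty.
samples = np.stack([layer.forward(x) for _ in range(100)])
print("predictive mean:", samples.mean(), "predictive std:", samples.std())
```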