REGULAR PAPER
Distributed deep learning system for cancerous region detection on Sunway TaihuLight

GuoFeng Lv1 · MingFan Li1 · Hong An1 · Han Lin1 · Junshi Chen1 · Wenting Han1 · Qian Xiao2 · Fei Wang2 · Rongfen Lin2

Received: 26 February 2020 / Accepted: 2 July 2020
© China Computer Federation (CCF) 2020
Abstract
To explore the potential of distributed training of deep neural networks, we implement several distributed algorithms on the basis of swFlow on the world-leading supercomputer, Sunway TaihuLight. Starting from two naive designs, parameter server and ring all-reduce, we expose the limitations of these communication models and discuss optimizations that adapt them to the five-level interconnect architecture of the Sunway system. To reduce the communication bottleneck on large-scale systems, multi-server and hierarchical ring all-reduce models are introduced. With a benchmark drawn from a deep learning-based cancerous region detection algorithm, the average parallel efficiency exceeds 80% on up to 1024 processors. This reveals the great opportunity in the joint combination of deep learning and HPC systems.

Keywords Deep neural network · Parameter server · Ring all-reduce · Cancerous region detection
1 Introduction
* GuoFeng Lv [email protected]

MingFan Li [email protected]
Hong An [email protected]
Han Lin [email protected]
Junshi Chen [email protected]
Wenting Han [email protected]
Qian Xiao [email protected]
Fei Wang [email protected]
Rongfen Lin [email protected]

1 School of Computer Science and Technology, University of Science and Technology of China, Hefei 230026, Anhui, China
2 Wuxi Jiangnan Institute of Computing Technology, Wuxi 214083, Jiangsu, China
Over the past few years, advances in deep learning have driven tremendous progress and outperformed many state-of-the-art approaches in conventional fields. In particular, deep learning models can now recognize images, process natural language, and defeat humans in challenging strategy games. At this point, deep learning has drawn wider attention from experts across areas, as demonstrated last year by exascale deep learning for climate analysis, from the traditional HPC domain, on Summit. Deep learning usually demands a large amount of training data and powerful computing resources for data analysis. For one thing, there is a growing demand to accelerate smart applications on a wide spectrum of devices, ranging from high-performance compute cards such as Intel KNL and NVIDIA GPUs to custom AI accelerators such as Google TPUs or the Cambricon series. For another, recent experiments in which DeepLabv3+ scales up to 27,360 V100 GPUs with a peak throughput of 1.13 EF/s show the vast potential of distributed training (Kurth et al. 2018). In general, the high-performance supercomputer has become an appealing substitute for the redundant model training of deep learning. With the increase of cluster scale and high-performance accelerators, heavy communication has become the bottleneck for distributed applications (Abadi et al. 2016; Akiba et al. 2017a).
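To make the communication pattern concrete, the following is a minimal, single-process sketch of the ring all-reduce scheme referred to above; it simulates P workers passing gradient chunks around a ring in plain NumPy rather than using a real MPI or swFlow runtime, and the function name ring_allreduce and all parameters are illustrative assumptions, not the system's actual implementation.

import numpy as np

def ring_allreduce(grads):
    """Single-process simulation of ring all-reduce (illustrative sketch).

    grads: list of P equal-length 1-D numpy arrays, one per simulated worker.
    Returns per-worker results; every worker ends up with the element-wise sum.
    """
    P = len(grads)
    # Each worker splits its local gradient into P chunks.
    chunks = [np.array_split(g.astype(np.float64), P) for g in grads]

    # Phase 1: reduce-scatter. At step s, worker r passes chunk (r - s) mod P
    # to its right neighbour, which accumulates it. After P - 1 steps,
    # worker r owns the fully reduced chunk (r + 1) mod P.
    for step in range(P - 1):
        sends = [chunks[r][(r - step) % P].copy() for r in range(P)]
        for r in range(P):
            src = (r - 1) % P                    # left neighbour on the ring
            chunks[r][(src - step) % P] += sends[src]

    # Phase 2: all-gather. The fully reduced chunks circulate around the ring
    # until every worker holds all of them.
    for step in range(P - 1):
        sends = [chunks[r][(r + 1 - step) % P].copy() for r in range(P)]
        for r in range(P):
            src = (r - 1) % P
            chunks[r][(src + 1 - step) % P] = sends[src]

    return [np.concatenate(c) for c in chunks]

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    workers = [rng.standard_normal(16) for _ in range(4)]  # 4 simulated workers
    reduced = ring_allreduce(workers)
    expected = np.sum(workers, axis=0)
    assert all(np.allclose(r, expected) for r in reduced)

Because every worker sends only 1/P of its gradient per step, the total volume transferred per worker is roughly 2(P - 1)/P of the gradient size regardless of the node count, which is why ring-based schemes tend to scale better than a single parameter server as the cluster grows.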