To cloud or not to cloud: an on-line scheduler for dynamic privacy-protection of deep learning workload on edge devices
REGULAR PAPER
Yibin Tang1,2,4 · Ying Wang1,2 · Huawei Li1,2,3 · Xiaowei Li1,2

Received: 6 July 2020 / Accepted: 5 October 2020
© China Computer Federation (CCF) 2020
Abstract

Recently, deep learning applications are thriving in edge and mobile computing scenarios, driven by latency constraints, data security and privacy, and other considerations. However, because of limits on power delivery, battery lifetime, and computation resources, offering real-time neural network inference requires specialized energy-efficient architectures, and sometimes coordination between edge devices and powerful cloud or fog facilities. This work investigates a realistic scenario in which an on-line scheduler is needed to meet latency requirements even when the edge computing resources and communication speed fluctuate dynamically, while protecting the privacy of users as well. The scheduler also leverages the approximate-computing nature of neural networks and actively trades off excessive neural network propagation paths for latency guarantees even when local resource provision is unstable. By combining neural network approximation and dynamic scheduling, the real-time deep learning system can adapt to different latency/accuracy requirements and the resource fluctuation of mobile-cloud applications. Experimental results also demonstrate that the proposed scheduler significantly improves the energy efficiency of real-time neural networks on edge devices.

Keywords Real-time · Deep learning · Edge computing · Privacy protection
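At a high level, the scheduling decision the abstract describes can be pictured as a per-inference choice among cloud offloading, full local execution, and an approximate local path. The following is a minimal illustrative sketch, not the paper's actual algorithm: the function name, the millisecond estimates, and the privacy flag are all assumed inputs that a real scheduler would measure or configure at run time.

```python
def choose_execution(latency_budget_ms, local_est_ms, net_est_ms,
                     cloud_est_ms, privacy_sensitive):
    """Pick where (and how accurately) to run one inference.

    Hypothetical sketch: local_est_ms fluctuates with device load,
    net_est_ms with the current link speed.
    """
    # Privacy-sensitive inputs are never offloaded; everything else
    # may go to the cloud if transfer + remote compute fits the budget.
    if not privacy_sensitive and net_est_ms + cloud_est_ms <= latency_budget_ms:
        return "cloud"
    # Run the full model locally when the device can still make the deadline.
    if local_est_ms <= latency_budget_ms:
        return "local-full"
    # Otherwise fall back to an approximate local path (pruned
    # propagation paths, lower accuracy) to preserve the latency guarantee.
    return "local-approx"
```

Under this sketch, a slow link or a privacy-sensitive input keeps the work on the device, and a heavily loaded device degrades gracefully to the approximate path rather than missing its deadline.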
* Corresponding author: Huawei Li, [email protected]

Yibin Tang, [email protected] · Ying Wang, [email protected] · Xiaowei Li, [email protected]

1 State Key Laboratory of Computer Architecture, Institute of Computing Technology, Chinese Academy of Sciences, Beijing, China
2 University of Chinese Academy of Sciences, Beijing, China
3 Peng Cheng Laboratory, Shenzhen, China
4 Wuhan Digital Engineering Institute, Wuhan, China

1 Introduction

Deep neural networks (DNNs) have shown outstanding performance and versatility in areas ranging from computer vision and virtual reality to speech processing and even general-purpose computing. Nowadays, deep learning applications are spreading to mobile devices such as smartphones, robots, surveillance systems, and other embedded systems or IoT devices, making them more 'intelligent' (Gubbi et al. 2013). However, edge and embedded computing devices are power-constrained when processing complex workloads such as deep convolutional neural network (CNN) models. Moreover, edge deep learning applications, such as autonomous driving assistance systems (Redmon et al. 2016; Liu et al. 2016), speech interaction on wearable devices (Lane and Georgiev 2015), and other latency-sensitive tasks, are stressed by the requirement of real-time processing. It means that guaranteeing quality of service (QoS) for the