Job placement using reinforcement learning in GPU virtualization environment


Jisun Oh · Yoonhee Kim
Received: 25 December 2019 / Revised: 25 December 2019 / Accepted: 31 December 2019
© Springer Science+Business Media, LLC, part of Springer Nature 2020

Abstract

Graphics Processing Units (GPUs) are widely used for high-speed processing in computational science areas such as biology, chemistry, and meteorology, and in machine learning areas such as image and video analysis. Recently, data centers and cloud companies have adopted GPUs and offer them as computing resources. Because most cloud providers allocate GPU resources to users exclusively, the allocated GPU may not be fully utilized. Although sharing a GPU among multiple users can increase resource utilization, individual jobs may suffer performance degradation because of interference between different jobs. It is difficult for a cloud provider to heuristically predict or control the performance of the diverse applications executed on diverse cloud resources based on their characteristics. Therefore, an intelligent job placement technique is required to minimize interference between jobs and increase resource utilization. This study defines the resource utilization history of applications and proposes a reinforcement learning-based job placement technique that uses this history as input. A deep Q-network (DQN), a deep reinforcement learning model, is used to learn from the resource utilization history. The trained model predicts which co-located jobs will have the least impact on overall performance when executed simultaneously, and places jobs so that the capacity of the current resource is not exceeded. This approach prevents performance degradation for applications with diverse execution characteristics and increases resource utilization by executing applications on shared resources. The superiority of the proposed approach is demonstrated by comparing it with other methods on workloads with various resource utilization characteristics. The experiments show that the proposed method reduces total execution time and uses resources effectively while maintaining performance.

Keywords: GPU · DQN learning · Interference prediction · Multiple job placement
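To make the approach concrete, the following is a minimal sketch of a DQN-based placement agent in PyTorch. It is an illustration under simplifying assumptions rather than the authors' implementation: the state layout (utilization features of the incoming job plus each candidate GPU), the action space (choosing one of a fixed number of GPUs), the reward (an externally computed interference penalty), and all hyperparameters are hypothetical, and the target network of standard DQN is omitted for brevity.

```python
# Minimal DQN placement sketch (PyTorch). All names, feature layouts, and
# hyperparameters below are illustrative assumptions, not the paper's design.
import random
from collections import deque

import torch
import torch.nn as nn
import torch.optim as optim

N_GPUS = 4                     # number of candidate GPUs (assumed)
STATE_DIM = 2 + 2 * N_GPUS     # job (core, mem) usage + per-GPU (core, mem) usage

class QNet(nn.Module):
    """Small fully connected network approximating Q(state, action)."""
    def __init__(self, state_dim: int, n_actions: int):
        super().__init__()
        self.layers = nn.Sequential(
            nn.Linear(state_dim, 64), nn.ReLU(),
            nn.Linear(64, 64), nn.ReLU(),
            nn.Linear(64, n_actions),
        )

    def forward(self, x):
        return self.layers(x)

q_net = QNet(STATE_DIM, N_GPUS)
optimizer = optim.Adam(q_net.parameters(), lr=1e-3)
replay = deque(maxlen=10_000)          # experience replay buffer
GAMMA, EPSILON, BATCH = 0.99, 0.1, 32  # illustrative hyperparameters

def select_gpu(state: torch.Tensor) -> int:
    """Epsilon-greedy placement: usually the GPU with the highest Q-value."""
    if random.random() < EPSILON:
        return random.randrange(N_GPUS)
    with torch.no_grad():
        return int(q_net(state).argmax())

def remember(state, action, reward, next_state):
    """Store one placement decision and its observed outcome."""
    replay.append((state, action, reward, next_state))

def train_step():
    """One gradient step on a random minibatch from the replay buffer."""
    if len(replay) < BATCH:
        return
    states, actions, rewards, next_states = zip(*random.sample(replay, BATCH))
    s = torch.stack(states)
    s2 = torch.stack(next_states)
    a = torch.tensor(actions, dtype=torch.int64)
    r = torch.tensor(rewards, dtype=torch.float32)
    q = q_net(s).gather(1, a.unsqueeze(1)).squeeze(1)
    with torch.no_grad():
        target = r + GAMMA * q_net(s2).max(dim=1).values
    loss = nn.functional.mse_loss(q, target)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```

In such a setup, the scheduler would call select_gpu for each incoming job, observe the resulting interference or slowdown as a (negative) reward, record the transition with remember, and invoke train_step periodically so that placement decisions gradually favor co-locations with less mutual interference.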

1 Introduction

A Graphics Processing Unit (GPU) consists of thousands of processing cores and performs parallel processing operations at high speed. Owing to this highly accelerated processing of computing tasks, general-purpose GPUs (GPGPUs) are broadly used in diverse areas such as machine learning (ML) and high-performance computing (HPC) applications. Hence, cloud and server infrastructure providers offer GPU servers to users for the execution of various applications. Many large cloud providers, such as Amazon EC2 [1], N