On Scheduling Coflows

  • PDF / 1,797,362 Bytes
  • 26 Pages / 439.37 x 666.142 pts Page_size
  • 43 Downloads / 156 Views

DOWNLOAD

REPORT


On Scheduling Coflows Saba Ahmadi1   · Samir Khuller3 · Manish Purohit2 · Sheng Yang1 Received: 5 October 2017 / Accepted: 18 June 2020 © Springer Science+Business Media, LLC, part of Springer Nature 2020

Abstract Applications designed for data-parallel computation frameworks such as MapReduce usually alternate between computation and communication stages. Coflow scheduling is a recent popular networking abstraction introduced to capture such application-level communication patterns in datacenters. In this framework, a datacenter is modeled as a single non-blocking switch with m input ports and m outj put ports. A coflow j is a collection of flow demands {dio }i∈{1,…,m},o∈{1,…,m} that is said to be complete once all of its requisite flows have been scheduled. We consider the offline coflow scheduling problem with and without release times to minimize the total weighted completion time. Coflow scheduling generalizes the well studied concurrent open shop scheduling problem and is thus NP-hard. Qiu et  al. (in: ACM Symposium on parallelism in algorithms and architectures. ACM, New York, pp 294–303, 2015) obtain the first constant approximation algorithms for this problem via√ LP rounding and give a deterministic 67 -approximation and a randomized 3 (9 + 163 2 ) ≈ 16.54-approximation algorithm. In this paper, we give a combinatorial algorithm that yields a deterministic 5-approximation algorithm for coflow scheduling with release times, and a deterministic 4-approximation for the case without release times. As for concurrent open shop problem with release times, we give a combinatorial 3-approximation algorithm.

A preliminary version of this paper appears in Proceedings of IPCO 2017. * Saba Ahmadi [email protected] Samir Khuller [email protected] Manish Purohit [email protected] Sheng Yang [email protected] 1

University of Maryland, College Park, USA

2

Google, Mountain View, USA

3

Northwestern University, Evanston, USA



13

Vol.:(0123456789)

Algorithmica

Keywords  Coflow scheduling · Concurrent open shop · Approximation algorithms

1 Introduction Large scale data centers have emerged as the dominant form of computing infrastructure over the last decade. The success of data-parallel computing frameworks such as MapReduce [8], Hadoop [1], and Spark [21] has led to a proliferation of applications that are designed to alternate between computation and communication stages. Typically, the intermediate data generated by a computation stage needs to be transferred across different machines during a communication stage for further processing. For example, there is a “Shuffle” phase between every consecutive “Map” and “Reduce” phases in MapReduce. With an increasing reliance on parallelization, these communication stages are responsible for a large amount of data transfer in a datacenter. Chowdhury and Stoica [4] introduced coflows as an effective networking abstraction to represent the collective communication requirements of a job. In this paper, we consider the problem of scheduling coflows to minim