Towards decomposition based multi-objective workflow scheduling for big data processing in clouds


Emmanuel Bugingo¹ • Defu Zhang¹ • Zhaobin Chen¹ • Wei Zheng¹

Received: 28 February 2020 / Revised: 2 November 2020 / Accepted: 5 November 2020
© Springer Science+Business Media, LLC, part of Springer Nature 2020

Abstract
A workflow is a group of tasks that are processed in a particular order to complete an application, and it is a popular paradigm for modeling complex big-data applications. Executing such applications in a distributed system such as a cloud or cluster involves optimizing several conflicting objectives, such as monetary cost, energy consumption, and the total execution time of the application (makespan). Despite this, most workflow scheduling approaches have focused on single- or bi-objective optimization problems. In this paper, we treat the problem of scheduling workflows in a cloud environment as a multi-objective optimization problem and propose a multi-objective workflow scheduling algorithm based on decomposition. The proposed algorithm is capable of finding optimal solutions with a single run. Our evaluation results show that, in a single run, the proposed approach obtains Pareto front solutions that are at least as good as the schedules produced by running a single-objective scheduling algorithm with constraints multiple times.

Keywords Multi-objective optimization · Workflow scheduling · Cloud computing · Decomposition approach
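To make the notion of a Pareto front over conflicting objectives concrete, the following minimal Python sketch filters a set of candidate schedules down to the non-dominated ones. It is an illustration only and not the algorithm proposed in this paper: the names `dominates` and `pareto_front` and the objective tuples (makespan, cost, energy) are hypothetical, and the numbers are made up for the example. All three objectives are assumed to be minimized.

```python
from typing import List, Tuple

# Hypothetical objective vector: (makespan, monetary cost, energy), all minimized.
Objectives = Tuple[float, float, float]


def dominates(a: Objectives, b: Objectives) -> bool:
    """Return True if a is no worse than b in every objective and better in at least one."""
    no_worse = all(x <= y for x, y in zip(a, b))
    strictly_better = any(x < y for x, y in zip(a, b))
    return no_worse and strictly_better


def pareto_front(points: List[Objectives]) -> List[Objectives]:
    """Keep only the points that no other point dominates (input order preserved)."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


# Made-up candidate schedules, purely for illustration.
candidates = [
    (10.0, 5.0, 3.0),
    (12.0, 4.0, 3.5),
    (11.0, 6.0, 3.2),  # dominated by (10.0, 5.0, 3.0)
    (9.0, 7.0, 4.0),
]
front = pareto_front(candidates)
```

Here `front` retains the three schedules that trade one objective off against another, while the schedule that is worse in every objective than another candidate is discarded.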

✉ Wei Zheng [email protected]
¹ Department of Computer Science, School of Informatics, Xiamen University, Xiamen, China

1 Introduction
Recently, the workflow has become a popular paradigm for modeling the execution of big data applications in distributed environments such as clouds and clusters [1]. Since then, research on workflow scheduling has received much attention, with the objective of producing an optimal execution time. It is well known that workflow scheduling in a heterogeneous environment is NP-complete [2]. Traditionally, optimizing the overall execution time (makespan) of the workflow has been an important and common objective of workflow scheduling. Many works in the literature have designed heuristic algorithms to obtain schedules with minimized makespan. However, as cloud computing gains popularity, makespan is no longer the only objective to be considered during workflow scheduling. Many other objectives that can be regarded as being as important as makespan have arisen, such as cost, energy, reliability, and utilization. Those objectives need to be taken into consideration together with makespan during workflow scheduling. Therefore, modern cloud workflow scheduling algorithms must be able to optimize more than one objective at the same time. Generally, the main concern for cloud computing customers, when selecting virtual machine (VM) instances to execute their workflows, is the monetary cost. Cloud VM instances renting price i