Post-Scheduling Optimization of Parallel Programs




Stephen Shafer 1 and Kanad Ghose 2

1 Lockheed Martin Postal Systems, 1801 State Route 17 C, Owego, NY 13827-3998
[email protected]
2 Department of Computer Science, State University of New York, Binghamton, NY 13902-6000
[email protected]

Abstract. No current task schedulers for distributed-memory MIMD machines produce optimal schedules in general, so it is possible for optimizations to be performed after scheduling. Message merging, communication reordering, and task duplication are shown to be effective post-scheduling optimizations. The percentage decrease in execution time is dependent on the original schedule, so improvements are not always achievable for every schedule. However, significant decreases in execution time (up to 50%) are possible, which makes the investment in extra processing time worthwhile.

1. Introduction

When preparing a program for execution on a distributed-memory MIMD machine, choosing the right task scheduler can make a big difference in the final execution time of the program. Many static schedulers are available that use varying methods, such as list scheduling, clustering, and graph theory, to derive a schedule. Except for a few restricted cases, however, no scheduler can consistently produce an optimal answer. Thus, post-scheduling optimizations are still possible. Assuming that tasks are indivisible, and that the ordering of tasks on each processor does not change, there are two possible sources of improvement in execution time: the tasks themselves, and the inter-processor messages.

We explore two communication-related optimizations here. First, pairs of messages from one processor to another may be combined, or merged, so that the data from both are sent as one message. This can reduce the execution time of the program. The second communication-related optimization deals with tasks that send multiple messages to tasks on other processors. For those tasks, the order of the messages may be changed so that more time-critical messages are sent first. We show two different orderings and how they affect the final execution time. Finally, we have implemented a task-related optimization that duplicates tasks when the messages they send are causing delays in the destination tasks.
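The message-reordering idea above can be sketched with a toy model. The names, cost model, and numbers below are illustrative assumptions, not the paper's actual algorithm: a task sends its outgoing messages serially over one link, and each destination task has an earliest time at which it could otherwise start. Sending messages in order of how soon each destination needs its data (rather than in program order) can reduce the worst delay imposed on any destination:

```python
from dataclasses import dataclass

@dataclass
class Message:
    name: str
    send_cost: float      # time the message occupies the sender's link
    latency: float        # network transit time after the send completes
    dest_ready: float     # earliest time the destination task could otherwise start

def max_delay(order, t0):
    """Worst extra waiting imposed on any destination task when the
    messages are sent serially starting at time t0, in the given order."""
    t, worst = t0, 0.0
    for m in order:
        t += m.send_cost               # sends are serialized on the link
        arrival = t + m.latency
        worst = max(worst, arrival - m.dest_ready)
    return worst

# Hypothetical schedule fragment: one task emits three messages at time 0.
msgs = [Message("a", 2, 1, 10), Message("b", 2, 1, 4), Message("c", 2, 1, 6)]

fifo = max_delay(msgs, t0=0)                                       # program order
eadf = max_delay(sorted(msgs, key=lambda m: m.dest_ready), t0=0)   # most urgent first
```

In this fragment the program-order schedule makes destinations "b" and "c" wait past their ready times, while the urgency-first ordering delays no one; the two orderings studied in the paper can be compared in exactly this way.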

P. Amestoy et al. (Eds.): Euro-Par'99, LNCS 1685, pp. 440-444, 1999. © Springer-Verlag Berlin Heidelberg 1999


2. Message Merging

2.1 The Concept of Merging

Wang and Mehrotra [WM 91] proposed a technique for merging messages that correspond to data dependencies. Their approach reduces the number of communications by combining a pair of messages from a source processor to the same destination processor into a single message that satisfies the data dependencies. In [SG 95], we showed that this method can actually increase the execution time, and might even introduce deadlock. We also showed that message merging can reduce execution time without i
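Why merging sometimes helps and sometimes hurts can be seen with a simple linear communication-cost model, cost(n) = alpha + beta * n. This is a sketch under assumed parameters, not the cost model of [WM 91] or [SG 95]: merging two messages saves one startup overhead, but the merged message cannot leave until the later payload is ready, which can delay data that a separate earlier send would have delivered sooner:

```python
# Linear cost model: sending n bytes costs ALPHA + BETA * n.
# ALPHA (startup) and BETA (per-byte cost) are assumed values for illustration.
ALPHA, BETA = 10.0, 0.1

def send_cost(nbytes):
    return ALPHA + BETA * nbytes

def separate(t1, n1, t2, n2):
    """Arrival times when each message is sent as soon as its data are
    ready (at times t1 <= t2); sends are serialized on the link."""
    a1 = t1 + send_cost(n1)
    a2 = max(t2, a1) + send_cost(n2)
    return a1, a2

def merged(t1, n1, t2, n2):
    """Both payloads leave together once the later one is ready at t2,
    paying a single startup overhead."""
    a = t2 + send_cost(n1 + n2)
    return a, a

# Second payload ready soon after the first: merging wins (one startup saved).
close = (separate(0, 100, 5, 100), merged(0, 100, 5, 100))

# Second payload ready much later: merging delays the first message's data.
far = (separate(0, 100, 50, 100), merged(0, 100, 50, 100))
```

In the first scenario the merged send delivers both payloads earlier than two separate sends; in the second it delays both, which is the effect that can lengthen, rather than shorten, the overall schedule.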