Fast Scheduling in Distributed Transactional Memory



Costas Busch1 · Maurice Herlihy2 · Miroslav Popovic3 · Gokarna Sharma4

Accepted: 12 September 2020 © Springer Science+Business Media, LLC, part of Springer Nature 2020

Abstract We investigate scheduling algorithms for distributed transactional memory systems in which transactions residing at the nodes of a communication graph operate on shared, mobile objects. A transaction requests the objects it needs, executes once those objects have been assembled, and then possibly forwards those objects to other waiting transactions. Minimizing execution time in this model is known to be NP-hard for arbitrary communication graphs, and also hard to approximate within any factor smaller than the size of the graph. Nevertheless, networks on chips, multi-core systems, and clusters are not arbitrary. Here, we explore efficient execution schedules in specialized graphs likely to arise in practice: Clique, Line, Grid, Cluster, Hypercube, Butterfly, and Star. In most cases, when individual transactions request k objects, we obtain solutions within an O(k) factor of optimal, yielding near-optimal solutions for constant k. These execution times approximate the TSP tour lengths of the objects in the graph. We show that for general networks, even for two objects (k = 2), it is impossible to obtain an execution time close to the objects' optimal TSP tour lengths, which is why it is useful to consider more realistic network models. To our knowledge, this is the first attempt to obtain provably fast schedules for distributed transactional memory.

Keywords Transactional memory · Distributed systems · Execution time · Approximation · Data-flow model · Scheduling · Contention
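To make the data-flow model concrete, the following is a minimal Python sketch, not the paper's scheduling algorithm: it computes the cost of a naive serial schedule in which each requested object travels along a shortest path to the requesting transaction's node, illustrating why execution time is governed by how far objects must travel (and hence by TSP-like tours over the objects' locations). The function names and graph encoding are illustrative assumptions.

```python
# Illustrative sketch of the data-flow model (not the paper's algorithm):
# objects travel along shortest paths to the node of the requesting
# transaction, which executes once all k objects have arrived.
from collections import deque

def dist(adj, src, dst):
    """BFS shortest-path distance in an unweighted graph (dict: node -> neighbors)."""
    seen, frontier, d = {src}, deque([src]), {src: 0}
    while frontier:
        u = frontier.popleft()
        if u == dst:
            return d[u]
        for v in adj[u]:
            if v not in seen:
                seen.add(v)
                d[v] = d[u] + 1
                frontier.append(v)
    raise ValueError("graph is disconnected")

def serial_schedule_cost(adj, transactions, home):
    """Execute transactions one after another; each pulls its requested
    objects from wherever they currently reside. A transaction's delay is
    its slowest object's travel time; objects then stay at the new node
    (and may later be forwarded to other waiting transactions)."""
    total = 0
    for node, objects in transactions:
        total += max(dist(adj, home[obj], node) for obj in objects)
        for obj in objects:
            home[obj] = node
    return total

# Tiny example: a line graph 0-1-2-3-4 with two transactions.
line = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}
txs = [(4, ["a"]), (0, ["a", "b"])]
print(serial_schedule_cost(line, txs, {"a": 0, "b": 2}))  # 4 + max(4, 2) = 8
```

Even this toy schedule shows the gap the paper studies: a clever ordering of transactions can shorten the total tour the objects take through the graph, while a poor ordering can force objects to retrace long paths.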

A preliminary version of this paper appears in the Proceedings of the 29th ACM Symposium on Parallelism in Algorithms and Architectures (SPAA), pages 173-182, Washington, DC, USA, July 2017 [4].

Costas Busch

[email protected]

Extended author information available on the last page of the article.


1 Introduction

Concurrent processes (threads) need to synchronize to avoid introducing inconsistencies in shared data objects. Traditional synchronization mechanisms such as locks and barriers have well-known limitations and pitfalls, including deadlock, priority inversion, reliance on programmer conventions, and vulnerability to failure or delay. Transactional memory (TM) [16, 35] has emerged as an alternative. Using TM, code is split into transactions: blocks of code that appear to execute atomically with respect to one another. Transactions are executed speculatively: synchronization conflicts or failures may cause an executing transaction to abort, in which case its effects are rolled back and the transaction is restarted. In the absence of conflicts or failures, a transaction typically commits, causing its effects to become visible. Several commercial processors provide direct hardware support for TM, including Intel's Haswell [19] and IBM's Blue Gene/Q [14], zEnterprise EC12 [26], and Power8 [5]. There ar
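To illustrate the speculative execute/abort/commit cycle just described, here is a toy software sketch in Python, an assumption for illustration only and not any of the hardware TMs cited above nor the paper's model: reads record the version they observed, writes are buffered, and commit publishes the buffer only if no read was invalidated, otherwise the transaction body reruns.

```python
import threading

class TVar:
    """A shared transactional variable with a version counter (toy STM)."""
    def __init__(self, value):
        self.value = value
        self.version = 0

_commit_lock = threading.Lock()  # serializes validation + publication

def atomically(tx_body):
    """Run tx_body(read, write) speculatively, retrying until it commits."""
    while True:
        reads, writes = {}, {}

        def read(tvar):
            if tvar in writes:                    # read-your-own-writes
                return writes[tvar]
            reads.setdefault(tvar, tvar.version)  # remember version seen
            return tvar.value

        def write(tvar, value):
            writes[tvar] = value                  # buffered until commit

        result = tx_body(read, write)

        with _commit_lock:
            # Validate: commit only if nothing we read has since changed.
            if all(tvar.version == v for tvar, v in reads.items()):
                for tvar, value in writes.items():
                    tvar.value = value
                    tvar.version += 1
                return result
        # Conflict: buffered effects are discarded and the body reruns.

# Usage: atomically move one unit between two shared counters.
a, b = TVar(10), TVar(0)

def transfer(read, write):
    write(a, read(a) - 1)
    write(b, read(b) + 1)

atomically(transfer)
print(a.value, b.value)  # -> 9 1
```

The retry loop captures the essential TM contract: an aborted transaction's buffered writes are never published, so concurrent readers observe either all of a transaction's effects or none of them.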