Gossip-based visibility control for high-performance geo-distributed transactions

  • PDF / 1,591,452 Bytes
  • 22 Pages / 595.276 x 790.866 pts Page_size
  • 52 Downloads / 172 Views

DOWNLOAD

REPORT


SPECIAL ISSUE PAPER

Gossip-based visibility control for high-performance geo-distributed transactions Hua Fan1

· Wojciech Golab2

Received: 2 February 2020 / Revised: 12 August 2020 / Accepted: 14 August 2020 © Springer-Verlag GmbH Germany, part of Springer Nature 2020

Abstract Providing ACID transactions under conflicts across globally distributed data is the Everest of transaction processing protocols. Transaction processing in this scenario is particularly costly due to the high latency of cross-continent network links, which inflates concurrency control and data replication overheads. To mitigate the problem, we introduce Ocean Vista—a novel distributed protocol that guarantees strict serializability. We observe that concurrency control and replication address different aspects of resolving the visibility of transactions, and we address both concerns using a multi-version protocol that tracks visibility using version watermarks and arrives at correct visibility decisions using efficient gossip. Gossiping the watermarks enables asynchronous transaction processing and acknowledging transaction visibility in batches in the concurrency control and replication protocols, which improves efficiency under high cross-data center network delays. In particular, Ocean Vista can access conflicting transactions in parallel and supports efficient write-quorum/read-one access using one round trip in the common case. We demonstrate experimentally in a multi-data center cloud environment that our design outperforms a leading distributed transaction processing engine (TAPIR) more than tenfold in terms of peak throughput, albeit at the cost of additional latency for gossip and a more restricted transaction model. The latency penalty is generally bounded by one wide area network (WAN) round trip time (RTT), and in the best case (i.e., under light load) our system nearly breaks even with TAPIR by committing transactions in around one WAN RTT. Keywords Distributed transactions · Concurrency control · Replication · Geo-distributed data

1 Introduction Cloud providers make it easier to deploy applications and data across geographically distributed data centers for fault tolerance, elastic scalability, service localization, and cost efficiency. With such infrastructures, the medium and small enterprises can also build globally distributed storage systems to serve customers around the world. Distributed transactions in geographically distributed database systems, while being convenient to applications thanks to ACID semantics, are notorious for their high overhead, especially for highcontention workloads and globally distributed data.

B

Hua Fan [email protected] Wojciech Golab [email protected]

1

Alibaba Group, Hangzhou, China

2

University of Waterloo, Waterloo, Canada

The overhead of geo-distributed transactions arises not only from the transaction commitment and concurrency control protocols that coordinate different shards for atomicity and isolation, but also from the replication protocol (e.g., Paxos [21]) that