Autoscaling tiered cloud storage in Anna


SPECIAL ISSUE PAPER

Chenggang Wu1 · Vikram Sreekanti1 · Joseph M. Hellerstein1

Received: 1 February 2020 / Revised: 17 August 2020 / Accepted: 26 August 2020
© Springer-Verlag GmbH Germany, part of Springer Nature 2020

Abstract

In this paper, we describe how we extended a distributed key-value store called Anna into an autoscaling, multi-tier service for the cloud. In its extended form, Anna is designed to overcome the narrow cost–performance limitations typical of current cloud storage systems. We describe three key aspects of Anna’s new design: multi-master selective replication of hot keys, a vertical tiering of storage layers with different cost–performance trade-offs, and horizontal elasticity of each tier to add and remove nodes in response to load dynamics. Anna’s policy engine uses these mechanisms to balance service-level objectives around cost, latency, and fault tolerance. Experimental results explore the behavior of Anna’s mechanisms and policy, exhibiting orders of magnitude efficiency improvements over both commodity cloud KVS services and research systems.

Keywords Autoscaling · Key-value store · Cloud storage system · Data replication · Cost efficiency

1 Introduction

As public infrastructure cloud providers have matured in the last decade, the number of storage services they offer has soared. Popular cloud providers like Amazon Web Services (AWS) [9], Microsoft Azure [10], and Google Cloud Platform (GCP) [24] each have at least seven storage options. These services span the spectrum of cost–performance trade-offs: AWS ElastiCache, for example, is an expensive, memory-speed service, while AWS Glacier is extremely high latency and low cost. In between, there are a variety of services such as the Elastic Block Store (EBS), the Elastic File System (EFS), and the Simple Storage Service (S3). Azure and GCP both offer a similar range of storage solutions.

Each one of these services is tuned to a unique point in that design space, making it well suited to certain performance goals. Application developers, however, typically deal with a non-uniform distribution of performance requirements. For example, many applications generate a skewed access distribution, in which some data is “hot” while other data is “cold.” This is why traditional storage is assembled hierarchically: hot data is kept in fast, expensive cache while cold data is kept in slow, cheap storage. These access distributions have become more complex in modern settings, because they can change dramatically over time. Realistic workloads spike by orders of magnitude, and hot sets shift and resize. These large-scale variations in workload motivate an autoscaling service design, but most cloud storage services today are unable to respond to these dynamics.

The narrow performance goals of cloud storage services result in poor cost–performance trade-offs for applications. To improve performance,

Chenggang Wu [email protected]
1 University of California, Berkeley, 465 Soda Hall, Berkeley 94720, CA, USA
