Load Shedding for Window Queries Over Continuous Data Streams

To cope with bursty data arrivals, a stream query processor may perform load shedding to cut the system load by discarding some portion of tuples kept in memory. Load shedding can be conducted in a stateful operator such as join or aggregation in a query

  • PDF / 168,394 Bytes
  • 6 Pages / 439.37 x 666.14 pts Page_size
  • 73 Downloads / 231 Views

DOWNLOAD

REPORT


*

Abstract To cope with bursty data arrivals, a stream query processor may perform load shedding to cut the system load by discarding some portion of tuples kept in memory. Load shedding can be conducted in a stateful operator such as join or aggregation in a query plan tree or in a dedicated operator, called load shedder, which is typically placed at the entry of the query processor. In this paper, we show that the load shedding can also be performed in window operators. In general, window operators are placed in the initial phase of a query plan and designed to ignore tuples whose arrivals are not in a predefined order. With the functionality to control the number of tuple discards, they can play the role of load shedding. In the proposed method, the conventional load shedder is not necessary and the number of query processing steps can be reduced, which leads to performance improvement of the continuous query processing. Keywords Data streams · Load shedding · Sliding windows · Disorder control

1

Introduction

There has been substantial interest in applications that monitor the continuous streams of data items such as stock exchanges, network measurements, web page visits, sensor readings and so on [1]. In general, these applications are required to produce their results in a timely manner over rapidly incoming tuples (i.e. records) whose arrival rate may exceed the output rate of queries in a system. To meet the time constraint, many stream query processors such as Aurora [2] and STREAM [3] employ a load shedder which discards some portion of input tuples when a system cannot cope with the arrival rate of input tuples. K.R. Kim · H.G. Kim() Department of Computer Engineering, Sahmyook University, Seoul 139-742, Republic of Korea e-mail: [email protected] © Springer Science+Business Media Singapore 2015 D.-S. Park et al. (eds.), Advances in Computer Science and Ubiquitous Computing, Lecture Notes in Electrical Engineering 373, DOI: 10.1007/978-981-10-0281-6_23

159

160

K.R. Kim and H.G. Kim

So far, many load shedding techniques have been proposed, which can be classified into the following two groups.  Operator-level load shedding In this scheme, load shedding is performed in each stateful operator such as join or aggregation. Regarding this, Viglas et al. [4] suggested a method to maximize the number of output tuples in a sliding window join. Babcock et al. [5] proposed an algorithm to minimize the degree of inaccuracy in window aggregate queries.  System-level load shedding In this scheme, load shedding is performed in a dedicated operator placed at the entry of a stream query processor. Tatbul et al. [6] presented semantic load shedding where tuples are discarded based on values designated by a user-defined QoS (Quality of Service) specification. In this paper, we show that the load shedding can also be performed in window operators. Our motivation is that input tuples can be discarded from the window operators. More specifically, window operators are designed not to accept tuples whose arrivals are n