Scalable and Reliable Multi-dimensional Sensor Data Aggregation in Data Streaming Architectures



(2020) 4:5

ORIGINAL ARTICLE

Sören Henning1 · Wilhelm Hasselbring1

Received: 10 February 2020 / Revised: 2 April 2020 / Accepted: 27 August 2020
© The Author(s) 2020

Abstract Ever-increasing amounts of data and requirements to process them in real time lead to a growing number of analytics platforms and software systems being designed according to the concept of stream processing. A common area of application is processing continuous data streams from sensors, for example, IoT devices or performance monitoring tools. In addition to analyzing pure sensor data, analyses of data for entire groups of sensors often need to be performed. Therefore, the data streams of the individual sensors have to be continuously aggregated into a data stream for the group. Motivated by a real-world application scenario of analyzing power consumption in Industry 4.0 environments, we propose that such a stream aggregation approach has to allow for aggregating sensors in hierarchical groups, support multiple such hierarchies in parallel, provide reconfiguration at runtime, and preserve the scalability and reliability qualities of stream processing techniques. We propose a stream processing architecture fulfilling these requirements, which can be integrated into existing big data architectures. As all state-of-the-art stream processing frameworks have to handle a trade-off between latency, resource efficiency, and correctness, our proposed architecture can be configured either for low-latency and resource-efficient computation or for always ensuring correct results. To assist adopters in choosing appropriate configuration options, we provide an experimental comparison. We present a pilot implementation of our proposed architecture and show how it is used in industry. Furthermore, in experimental evaluations we show that our solution scales linearly with the number of sensors and provides adequate reliability in the presence of faults.

Keywords Big data · Stream processing · Stream aggregation · IoT sensor data

This article belongs to the Topical Collection: Data-Enabled Discovery for Industrial Cyber-Physical Systems. Guest Editor: Raju Gottumukkala

Sören Henning [email protected]
Wilhelm Hasselbring [email protected]

1 Software Engineering Group, Kiel University, 24098 Kiel, Germany

Introduction

Stream processing [1, 2] has evolved as a paradigm to process and analyze continuous streams of data, for example, coming from IoT sensors. The rapid development of stream processing engines [3] in recent years has paved the way for applications that process data exclusively online, i.e., as soon as it is recorded. Whereas a couple of years ago Lambda architectures were the de facto standard for analytics platforms, more and more platforms now follow the Kappa architecture pattern, where data is exclusively processed online [4]. Further, entire software system architectures [5] follow patterns such as asynchronously communicating m