Architectural assessment of NoSQL and NewSQL systems

Natalia Chaudhry¹ · Muhammad Murtaza Yousaf¹

© Springer Science+Business Media, LLC, part of Springer Nature 2020

Abstract
With the recent trend towards big data, a number of scalable data management systems, namely NoSQL and NewSQL systems, have been developed to manage massive data effectively. The algorithms involved in the architectural design of a data management system define the response time of an application. The behavior and performance of different NoSQL and NewSQL systems vary on the basis of these architectural aspects. Hence, the architectural assessment of data management systems is vital for understanding their weaknesses and strengths. Therefore, this paper assesses the architecture of some well-known NoSQL and NewSQL systems in detail. To enhance the clarity of discussion and analysis, we identify and group together logically related architectural features, forming a feature vector (FV). Feature vectors related to transactional properties, fault tolerance, data storage, and data handling are designed and used in the architectural assessment. Various significant features are identified and assigned to a feature vector. Some well-known NoSQL and NewSQL systems are analyzed, compared, and discussed in depth with respect to these feature vectors. The discussion describes the algorithms each system uses to implement a particular architectural feature and analyzes their suitability in various scenarios. Important guidelines are presented that help in filtering the potential data management systems on the basis of application requirements.

Keywords  Big data · NoSQL · NewSQL · Transactional properties · Fault tolerance · Data storage · Data handling

* Natalia Chaudhry [email protected]
¹ PUCIT, University of the Punjab, Lahore, Pakistan

Distributed and Parallel Databases

1 Introduction
The eminence of big data has made it necessary for data management systems to handle high volumes of data efficiently. Moreover, the data generated by most applications lacks a predefined structure. Because the structure and complexity of such data make it difficult to process with traditional relational data management systems [93], NoSQL data management systems emerged as a solution. To deal with huge workloads and scalability challenges, it becomes necessary to use distributed computing environments to avoid latency and performance issues. Hence, most NoSQL data management systems follow a distributed architecture. In a distributed data management system, the data is partitioned across multiple physical nodes to satisfy the needs of on-demand scalability, storage flexibility, and high availability. In this way, it is easy to scale data horizontally, i.e., by spreading data across multiple physical nodes. Various aspects related to the storage and management of data come under the architecture of a distributed data management system. These are partitioning, replication, concurrency