A review of data replication based on meta-heuristics approach in cloud computing and data grid




METHODOLOGIES AND APPLICATION

Najme Mansouri1 · Mohammad Masoud Javidi1

© Springer-Verlag GmbH Germany, part of Springer Nature 2020

Abstract Heterogeneous distributed computing environments are emerging for data-intensive (big data) applications that require access to huge data files. Effective data management, such as efficient access and high data availability, has therefore become a critical requirement in these systems. Data replication is an essential technique for achieving these goals by storing multiple replicas in a judicious manner. Existing replication algorithms address metrics such as reliability, availability, bandwidth consumption, storage usage, and response time. In this paper, we present the different issues involved in data replication and discuss the key points of recent algorithms, with a tabular summary of their features. The focus of the review is on data replication algorithms based on meta-heuristic techniques. The review shows that previous studies have contributed substantially to improving response time, whereas energy consumption and security have received considerably less attention. Moreover, the impact of meta-heuristic algorithms on data replication performance is investigated in a simulation study. Finally, open issues and future challenges are presented.

Keywords Data replication · Cloud computing · Data grid · Meta-heuristic

1 Introduction In some scientific application areas, such as earth observation, huge amounts of data have become a significant part of the shared resources. Such large-scale datasets are usually distributed across different data centers. The data replication technique is commonly applied to manage large data in a distributed manner. A replication algorithm stores multiple replicas so as to achieve efficient and fault-tolerant data access in these systems (Mansouri et al. 2019; Dinesh Reddy et al. 2019). Although data management has been investigated previously (Mansouri 2014; Alghamdi et al. 2017; Pitchai et al. 2019; Mansouri and Javidi 2018a), very few of the available algorithms take a holistic view of the different costs and benefits of replication.
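To make the core idea concrete, the following is a minimal sketch (not from the paper) of how replicating a file across nodes improves fault tolerance. It assumes independent node failures and uses a hypothetical greedy placement that simply picks the most reliable nodes:

```python
def availability(failure_probs):
    """Probability that at least one replica is reachable,
    assuming independent node failures (a common simplification)."""
    unavailable = 1.0
    for p in failure_probs:
        unavailable *= p
    return 1.0 - unavailable

def place_replicas(node_failure_probs, k):
    """Greedy sketch: store k replicas on the k most reliable nodes."""
    ranked = sorted(range(len(node_failure_probs)),
                    key=lambda i: node_failure_probs[i])
    return ranked[:k]

nodes = [0.05, 0.20, 0.01, 0.10]   # per-node failure probabilities
chosen = place_replicas(nodes, 2)  # picks nodes 2 and 0
print(availability([nodes[i] for i in chosen]))  # 0.9995
```

Even with only two replicas, the combined availability (0.9995) far exceeds that of any single node, which is why replication is the standard tool for fault-tolerant data access.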

Communicated by V. Loia.

Correspondence: Najme Mansouri, [email protected]; Mohammad Masoud Javidi, [email protected]

1 Department of Computer Science, Shahid Bahonar University of Kerman, Kerman, Iran

Many of them adopt replication to enhance data availability and efficiency; these metrics improve as the number of copies in the system increases. However, the critical fact they ignore is that data replication also increases energy consumption and financial cost for the provider. Consequently, a data replication technique that balances this variety of trade-offs is necessary. In recent times, optimization is a booming field
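The trade-off described above is exactly what a meta-heuristic's fitness function would encode. The sketch below is an illustrative toy model, not a formula from the paper: all weights and costs are made-up parameters that reward availability while penalizing storage and energy cost per replica.

```python
def replication_fitness(num_replicas, file_size_gb,
                        base_availability=0.9,
                        storage_cost_per_gb=0.02,
                        energy_cost_per_replica=0.5,
                        w_avail=1.0, w_cost=0.1):
    """Toy fitness: availability benefit minus storage/energy penalty.
    All constants are illustrative assumptions, not values from the paper."""
    avail = 1.0 - (1.0 - base_availability) ** num_replicas
    cost = num_replicas * (file_size_gb * storage_cost_per_gb
                           + energy_cost_per_replica)
    return w_avail * avail - w_cost * cost

# More replicas raise availability but also cost, so the fitness peaks
# at a moderate replica count instead of growing without bound.
best = max(range(1, 11), key=lambda k: replication_fitness(k, 10.0))
```

Under these toy parameters the optimum is a small replica count (here, two), illustrating why blindly maximizing the number of copies is not cost-effective and why the literature turns to optimization techniques.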