Research on data pre-deployment in information service flow of digital ocean cloud computing
SHI Suixiang1,2, XU Lingyu3*, DONG Han1,2, WANG Lei3, WU Shaochun3, QIAO Baiyou4, WANG Guoren4
1 National Marine Data and Information Service, State Oceanic Administration, Tianjin 300171, China
2 Key Laboratory of Digital Ocean, State Oceanic Administration, Tianjin 300171, China
3 College of Computer Engineering and Science, Shanghai University, Shanghai 200072, China
4 College of Information Science and Engineering, Northeastern University, Shenyang 110819, China
Received 4 April 2014; accepted 27 May 2014
© The Chinese Society of Oceanography and Springer-Verlag Berlin Heidelberg 2014
Abstract
Data pre-deployment in the HDFS (Hadoop Distributed File System) is more complicated than in traditional file systems. Many key issues need to be addressed, such as determining the target location of the prefetched data, the amount of data to be prefetched, and the balance between data prefetching services and normal data accesses. To solve these problems, we exploit the characteristics of digital ocean information service flows and propose a deployment scheme that combines input data prefetching with output-data-oriented storage strategies. The method runs data preparation in parallel with data processing, thereby massively reducing the I/O time cost of digital ocean cloud computing platforms when processing multi-source information synergistic tasks. The experimental results show that the scheme has a higher degree of parallelism than traditional Hadoop mechanisms, shortens the waiting time of a running service node, and significantly reduces data access conflicts.
Key words: HDFS, data prefetching, cloud computing, service flow, digital ocean
Citation: Shi Suixiang, Xu Lingyu, Dong Han, Wang Lei, Wu Shaochun, Qiao Baiyou, Wang Guoren. 2014. Research on data pre-deployment in information service flow of digital ocean cloud computing. Acta Oceanologica Sinica, 33(9): 82–92, doi: 10.1007/s13131-014-0520-8
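The overlap of data preparation and data processing summarized in the abstract can be illustrated with a minimal sketch. It is not the authors' scheme: it assumes that each service node's input is independent source data that can be fetched before the previous node finishes, and the names `run_sequential`, `run_with_prefetch`, `fetch`, and `process` are hypothetical stand-ins for reading input blocks and running a service.

```python
from concurrent.futures import ThreadPoolExecutor


def run_sequential(nodes, fetch, process):
    """Baseline: each node fetches its input only when it starts running."""
    results = []
    for node in nodes:
        data = fetch(node)                  # the node waits for its input
        results.append(process(node, data))
    return results


def run_with_prefetch(nodes, fetch, process):
    """Fetch the next node's input in the background while the current node runs.

    Assumption (not from the paper): inputs do not depend on the previous
    node's output, so they can be fetched one step ahead.
    """
    results = []
    if not nodes:
        return results
    with ThreadPoolExecutor(max_workers=1) as pool:
        pending = pool.submit(fetch, nodes[0])
        for i, node in enumerate(nodes):
            data = pending.result()         # blocks only if the prefetch is unfinished
            if i + 1 < len(nodes):
                # Start preparing the next node's input before processing this one.
                pending = pool.submit(fetch, nodes[i + 1])
            results.append(process(node, data))
    return results
```

Both functions return the same results for the same flow; the difference is only that, in the second, the time spent fetching input for node i+1 can overlap with the processing of node i. A single background worker keeps at most one prefetch in flight, which is one simple way to limit interference with normal data accesses.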
1 Introduction
The explosive growth of information extracted from the large-scale sensor networks of the digital ocean marks the arrival of the big data era and raises new challenges in how to manage and process this data. Data-centered digital ocean cloud computing platforms are increasingly used to solve these problems (Shi et al., 2013). Marine information processing models and synergistic composite services of multi-source data are the main components of marine data processing. Based on the Hadoop cloud platform, we build composite service flow models, store marine model resources and other information in HDFS, customize services in the form of service flows, and then submit the resulting service flows to the cloud platform to process service requests submitted by users (Xu et al., 2010). These measures enable the full utilization and integration of the existing systems, thus greatly saving costs while at the same time protecting intellectual property rights. Generally, cloud platforms use a data-centered di