Improvising and Optimizing Resource Utilization in Big Data Processing


Abstract This paper aims to improve and optimize Big data processing in cloud computing. A homogeneous cluster setup supports only static processing, which is a major disadvantage when optimizing the response time towards clients. To achieve the utmost client satisfaction, the host server needs to be upgraded with the latest technology to fulfil all requirements. Big data processing is a frequent event on today’s Internet, and the proposed framework improves the response time. It also ensures that users have all their requirements fulfilled in optimal time. To this end, the server needs to eliminate the homogeneous cluster setup that is usually encountered in parallel data processing. Such a setup is static in nature, and dynamic allocation of resources is not possible in this kind of environment. Removing this restriction will improve the overall resource utilization and, consequently, reduce the processing cost.

Keywords Data mining · Data warehousing · Parallel data processing

Praveen Kumar (✉), Department of Computer Science, NIMS University, Jaipur, India, e-mail: [email protected]
V.S. Rathore, Shri Karni College, Jaipur, Rajasthan, India, e-mail: [email protected]
© Springer Science+Business Media Singapore 2016. M. Pant et al. (eds.), Proceedings of Fifth International Conference on Soft Computing for Problem Solving, Advances in Intelligent Systems and Computing 436, DOI 10.1007/978-981-10-0448-3_28

1 Introduction

In today’s digital era, a huge amount of data is processed in parallel over the Internet. Processing data optimally in the least time improves the output of parallel data processing. Many users try to access the same data over the Internet, and it is a challenging task for the server to provide an optimal outcome. The large volume of data that companies have to deal with every day has made traditional database solutions prohibitively expensive. Instead, these companies have promoted an architectural paradigm based on a large number of servers. Problems like processing crawled documents or restoring a web file are split into several independent subjobs, distributed among the available nodes, and computed in parallel. Big data processing is a key feature in accessing and operating on huge sets of data [1]. There are several ways to process data in parallel that improve processing time and response. Today’s frameworks have a major disadvantage: the homogeneous cluster setup [2]. To be more precise, when a job manager is allocated a job, it divides that job into many subjobs and allocates them to the task managers [3]. Once this cluster is set up and the Big data processing begins, there is no way to add more task managers or to remove task managers that have finished executing until all of them have executed. This is a problematic situation in which no data or resources can be allocated in the middle of data processing. This creates