Conquering Big Data with High Performance Computing

This book provides an overview of the resources and research projects that are bringing Big Data and High Performance Computing (HPC) on converging tracks. It demystifies Big Data and HPC for the reader by covering the primary resources, middleware, appli


Ritu Arora Editor

Editor Ritu Arora Texas Advanced Computing Center Austin, TX, USA

ISBN 978-3-319-33740-1
ISBN 978-3-319-33742-5 (eBook)
DOI 10.1007/978-3-319-33742-5

Library of Congress Control Number: 2016945048

© Springer International Publishing Switzerland 2016

Chapter 7 was created within the capacity of US governmental employment. US copyright protection does not apply.

This work is subject to copyright. All rights are reserved by the Publisher, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting, reproduction on microfilms or in any other physical way, and transmission or information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed.

The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use.

The publisher, the authors and the editors are safe to assume that the advice and information in this book are believed to be true and accurate at the date of publication. Neither the publisher nor the authors or the editors give a warranty, express or implied, with respect to the material contained herein or for any errors or omissions that may have been made.

Printed on acid-free paper

This Springer imprint is published by Springer Nature
The registered company is Springer International Publishing AG Switzerland

Preface

Scalable solutions for computing and storage are a necessity for the timely processing and management of big data. In the last several decades, High-Performance Computing (HPC) has already impacted the process of developing innovative solutions across various scientific and nonscientific domains. There are plenty of examples of data-intensive applications that take advantage of HPC resources and techniques for reducing the time-to-results.

This peer-reviewed book is an effort to highlight some of the ways in which HPC resources and techniques can be used to process and manage big data with speed and accuracy. Through the chapters included in the book, HPC has been demystified for the readers. HPC is presented both as an alternative to commodity clusters on which the Hadoop ecosystem typically runs in mainstream computing and as a platform on which alternatives to the Hadoop ecosystem can be efficiently run.

The book includes a basic overview of HPC, High-Throughput Computing (HTC), and big data (in Chap. 1). It introduces the readers to the various types of HPC and high-end storage resources that can be used for efficiently managing the entire big data lifecycle (in Chap. 2). Data movement across various systems (from stora