Data Science and Big Data Computing: Frameworks and Methodologies

This illuminating text/reference surveys the state of the art in data science, and provides practical guidance on big data analytics. Expert perspectives are provided by an authoritative collection of thirty-six researchers and practitioners from around t

Editor
Zaigham Mahmood
Department of Computing and Mathematics, University of Derby, Derby, UK
Business Management and Informatics Unit, North West University, Potchefstroom, South Africa

ISBN 978-3-319-31859-2
ISBN 978-3-319-31861-5 (eBook)
DOI 10.1007/978-3-319-31861-5

Library of Congress Control Number: 2016943181

© Springer International Publishing Switzerland 2016

This work is subject to copyright. All rights are reserved by the Publisher, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting, reproduction on microfilms or in any other physical way, and transmission or information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed.

The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use.

The publisher, the authors and the editors are safe to assume that the advice and information in this book are believed to be true and accurate at the date of publication. Neither the publisher nor the authors or the editors give a warranty, express or implied, with respect to the material contained herein or for any errors or omissions that may have been made.

Printed on acid-free paper

This Springer imprint is published by Springer Nature
The registered company is Springer International Publishing AG Switzerland

To Rehana Zaigham Mahmood: For her Love and Support

Preface

Overview

Huge volumes of data are being generated by commercial enterprises, scientific domains and the general public. According to a recent report by IBM, we create 2.5 quintillion bytes of data every day. According to another recent study, data production will be 44 times greater in 2020 than it was in 2009. Data being a vital organisational resource, its management and analysis are becoming increasingly important: not just for business organisations but also for other domains including education, health, manufacturing and many other sectors of our daily life. This data, often referred to as Big Data because of its volume, variety and velocity, is no longer restricted to sensory outputs and classical databases; it also includes highly unstructured data in the form of textual documents, webpages, photos, spatial and multimedia data, graphical information, social media comments and public opinions. Since Big Data is characterised by massive sample sizes, high dimensionality and intrinsic heterogeneity, and since noise accumulation, spurious correlation and incidental endogeneity are common features of such dat