Fault-Tolerance Techniques for High-Performance Computing

This timely text/reference presents a comprehensive overview of fault-tolerance techniques for high-performance computing (HPC). The text opens with a detailed introduction to the concepts of checkpoint protocols and scheduling algorithms, prediction, repl



Thomas Herault • Yves Robert, Editors

Fault-Tolerance Techniques for High-Performance Computing

Computer Communications and Networks

Series editor: A.J. Sammes, Centre for Forensic Computing, Cranfield University, Shrivenham Campus, Swindon, UK

The Computer Communications and Networks series is a range of textbooks, monographs and handbooks. It sets out to provide students, researchers, and nonspecialists alike with a sure grounding in current knowledge, together with comprehensible access to the latest developments in computer communications and networking. Emphasis is placed on clear and explanatory styles that support a tutorial approach, so that even the most complex of topics is presented in a lucid and intelligible manner.

More information about this series at http://www.springer.com/series/4198


Editors

Thomas Herault, University of Tennessee, Knoxville, TN, USA

Yves Robert, Ecole Normale Supérieure de Lyon, Lyon, France, and University of Tennessee, Knoxville, TN, USA

ISSN 1617-7975
ISSN 2197-8433 (electronic)
Computer Communications and Networks
ISBN 978-3-319-20942-5
ISBN 978-3-319-20943-2 (eBook)
DOI 10.1007/978-3-319-20943-2

Library of Congress Control Number: 2015942754

Springer Cham Heidelberg New York Dordrecht London
© Springer International Publishing Switzerland 2015 for Chapters 1, 3, 4 and 5
© Springer International Publishing Switzerland (outside the USA) 2015 for Chapter 2

This work is subject to copyright. All rights are reserved by the Publisher, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting, reproduction on microfilms or in any other physical way, and transmission or information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed.

The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use.

The publisher, the authors and the editors are safe to assume that the advice and information in this book are believed to be true and accurate at the date of publication. Neither the publisher nor the authors or the editors give a warranty, express or implied, with respect to the material contained herein or for any errors or omissions that may have been made.

Printed on acid-free paper

Springer International Publishing AG Switzerland is part of Springer Science+Business Media (www.springer.com)

Preface

Objective

The main objective of this monograph is to provide an overview of Fault-Tolerance Techniques for High-Performance Computing (HPC). Resilience has already become a prominent issue on current large-scale platforms. The advent of exascale computers with millions of cores and billion-parallelism
