Structural Failure Models for Fault-Tolerant Distributed Computing

Given that faults cannot be prevented in sufficiently complex systems, means of fault tolerance are essential for dependable distributed systems. Designing and evaluating fault-tolerant systems require well-conceived fault models. In the past, theoretical



VIEWEG+TEUBNER RESEARCH
Software Engineering Research
Editor: Prof. Dr. Wilhelm Hasselbring

Traditionally, software engineering focuses on the process of constructing and evolving software systems. The operation of systems that are expected to continuously provide services with required quality properties is an equally great challenge. The goal of the series Software Engineering Research is to present innovative techniques and methods for engineering and operating sustainable software systems.

Timo Warns

Structural Failure Models for Fault-Tolerant Distributed Computing

With a foreword by Prof. Wilhelm Hasselbring


Bibliographic information published by the Deutsche Nationalbibliothek

The Deutsche Nationalbibliothek lists this publication in the Deutsche Nationalbibliografie; detailed bibliographic data are available on the Internet at http://dnb.d-nb.de.

Dissertation Universität Oldenburg, 2009

1st Edition 2010

All rights reserved
© Vieweg+Teubner Verlag | Springer Fachmedien Wiesbaden GmbH 2010

Editorial Office: Ute Wrasmann | Anita Wilke

Vieweg+Teubner Verlag is a brand of Springer Fachmedien. Springer Fachmedien is part of Springer Science+Business Media.
www.viewegteubner.de

No part of this publication may be reproduced, stored in a retrieval system or transmitted, in any form or by any means, electronic, mechanical, photocopying, recording, or otherwise, without the prior written permission of the copyright holder.

Registered and/or industrial names, trade names, trade descriptions etc. cited in this publication are part of the law for trade-mark protection and may not be used free in any form or by any means even if this is not specifically marked.

Cover design: KünkelLopka Medienentwicklung, Heidelberg
Printing company: STRAUSS GMBH, Mörlenbach
Printed on acid-free paper
Printed in Germany

ISBN 978-3-8348-1287-2

Foreword

Despite means of fault prevention such as extensive testing or formal verification, errors inevitably occur during system operation. To avoid subsequent system failures, critical distributed systems therefore require engineered means of fault tolerance. Achieving fault tolerance requires some redundancy, which, unfortunately, is subject to limitations. Appropriate fault models are needed to describe which types of faults and how many faults are tolerable in a certain context. Previous research on distributed systems has often introduced fault models that abstract away too many relevant system properties, such as dependent and propagating component failures. In this research work, Timo Warns introduces new structural failur