Bioinformatics for High Throughput Sequencing

The ultimate goal of bioinformatics is to extract biological knowledge out of the large amount of information generated by the scientific community. The chapters compiled in this volume will allow both biologists and computer scientists to understand the



Naiara Rodríguez-Ezpeleta
Ana M. Aransay
Michael Hackenberg
Editors


Editors

Naiara Rodríguez-Ezpeleta
Genome Analysis Platform
CIC bioGUNE
Derio, Bizkaia, Spain

Ana M. Aransay
Genome Analysis Platform
CIC bioGUNE
Derio, Bizkaia, Spain

Michael Hackenberg
Computational Genomics and Bioinformatics Group
Genetics Department & Biomedical Research Center (CIBM)
University of Granada, Spain

ISBN 978-1-4614-0781-2
e-ISBN 978-1-4614-0782-9
DOI 10.1007/978-1-4614-0782-9
Springer New York Dordrecht Heidelberg London
Library of Congress Control Number: 2011937571

© Springer Science+Business Media, LLC 2012

All rights reserved. This work may not be translated or copied in whole or in part without the written permission of the publisher (Springer Science+Business Media, LLC, 233 Spring Street, New York, NY 10013, USA), except for brief excerpts in connection with reviews or scholarly analysis. Use in connection with any form of information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed is forbidden.

The use in this publication of trade names, trademarks, service marks, and similar terms, even if they are not identified as such, is not to be taken as an expression of opinion as to whether or not they are subject to proprietary rights.

Printed on acid-free paper

Springer is part of Springer Science+Business Media (www.springer.com)

Preface

The purpose of this book is to collect in a single volume the essentials of high throughput sequencing data analysis. These new technologies make it possible to perform, at unprecedentedly low cost and high speed, a panoply of experiments spanning the sequencing of whole genomes or transcriptomes, the profiling of DNA methylation, and the detection of protein–DNA interaction sites, among others. Each experiment generates a massive amount of sequence information, making data analysis the major challenge in high throughput sequencing-based projects. Hundreds of bioinformatics applications have been developed so far, most of them focusing on specific tasks. Indeed, numerous approaches have been proposed for each analysis step, while integrated analysis applications and protocols are generally missing. As a result, even experienced bioinformaticians struggle to choose among the countless possibilities for analyzing their data. This, together with a shortage of qualified personnel, reveals an urgent need to train bioinformaticians in existing approaches and to develop integrated, “from start to end” software applications to face present and future challenges in data analysis.

Given this scenario, our motivation was to assemble a book covering the aforementioned aspects. Following three fundamental introductory chapters, the core of the book focuses on the bioinformatics aspects, presenting a comprehensive review of the methods and programs existing to analyze the raw data obtaine