Protein Function Prediction for Omics Era

Gene function annotation has been a central question in molecular biology. The importance of computational function prediction is increasing because more and more large-scale biological data, including genome sequences, protein structures, protein-protein



Daisuke Kihara Editor



Editor
Daisuke Kihara
Department of Biological Sciences/Computer Science
Purdue University
Hockmyer Hall
249 S. Martin Jischke Drive
West Lafayette, IN 47907-2107, USA

ISBN 978-94-007-0880-8
e-ISBN 978-94-007-0881-5
DOI 10.1007/978-94-007-0881-5
Springer Dordrecht Heidelberg London New York
Library of Congress Control Number: 2011925680
© Springer Science+Business Media B.V. 2011
No part of this work may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, electronic, mechanical, photocopying, microfilming, recording or otherwise, without written permission from the Publisher, with the exception of any material supplied specifically for the purpose of being entered and executed on a computer system, for exclusive use by the purchaser of the work.
Printed on acid-free paper
Springer is part of Springer Science+Business Media (www.springer.com)

Preface

Elucidation of protein function has been a central question in molecular biology, genetics, and biochemistry. The importance of computational function prediction is increasing because more and more genome sequences are being determined by genome sequencing projects. Recent advances in sequencing technologies have made the sequencing of complete genomes surprisingly fast, and it is clear that genome sequencing will become routine in biological and medical studies in the very near future. In addition, structural genomics projects, which have been running for several years, are producing an increasing number of protein structures of unknown function. Besides the flood of protein sequences and structures, other types of large-scale biological data, including protein–protein interaction data and gene expression data, are awaiting biological interpretation. Thus, the post-genomics era has entered its second phase, the omics era, in which various types of large-scale biological data are generated and cross-referenced toward a systems-level understanding of organisms and life. Function prediction is indispensable for capitalizing on these rich sources of omics data.

It has been 20 years since FASTA and BLAST, the most commonly used homology search tools, were developed. As exemplified by the fact that the first complete genome was finished 6 years after these two tools were developed, the circumstances of biological research have changed dramatically since then. The appearance of omics data has brought different needs and sources for function prediction. Conventional use of homology search methods is not necessarily the most suitable approach for analyzing large-scale data. For analyzing data that include many genes, large coverage in function annotation is essential. For biological interpretation of large-scale data, detailed biochemical function assignment to genes is not always necessary.
A broad class of function, or low-resolution function, is still helpful to understand functional unit of genes a