Genome Annotation



1 Introduction
2 Genome Annotation
2.1 Automatic Annotation
2.2 Manual Editing
2.3 Wet-Lab Experiments
3 Research Needs

K. N. Timmis (ed.), Handbook of Hydrocarbon and Lipid Microbiology, DOI 10.1007/978-3-540-77587-4_335, © Springer-Verlag Berlin Heidelberg, 2010


Abstract: Annotation identifies genes, proteins and pathways encoded by the raw DNA sequence of the genome. Annotation proceeds in three steps. The automatic annotation is refined by manual curation and finally by wet-lab experiments. The chapter contains a list of Web-accessible tools for genome annotation.

1 Introduction

Annotation identifies genes, proteins and pathways encoded by the raw DNA sequence. A newly sequenced genome is typically first processed by automatic annotation pipelines composed of a variety of software modules (Stothard and Wishart, 2006). If time and resources are available, human experts then manually curate the annotation. Finally, the annotation is refined and made definitive by wet-lab experiments.
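As a rough illustration of what the software modules in such a pipeline do, the following Python sketch chains two toy steps: a naive scan for open reading frames, followed by transfer of a functional label from the best database hit that passes a similarity threshold. It is a minimal sketch under stated assumptions, not the method of any real gene finder or annotation engine: the names `find_orfs`, `Hit` and `transfer_annotation`, the cutoff values, and the fallback label "hypothetical protein" are all chosen here for illustration.

```python
from dataclasses import dataclass

STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(seq, min_codons=3):
    """Return (start, end) pairs of forward-strand ORFs, end exclusive.

    Toy rule (an assumption, far simpler than real gene finders): an ORF
    runs from the first ATG in a reading frame to the next in-frame stop.
    """
    seq = seq.upper()
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i
            elif start is not None and codon in STOP_CODONS:
                # Length counts the start and stop codons.
                if (i + 3 - start) // 3 >= min_codons:
                    orfs.append((start, i + 3))
                start = None
    return sorted(orfs)


@dataclass
class Hit:
    """One database hit, e.g. parsed from a similarity-search report."""
    subject_id: str
    description: str
    percent_identity: float
    evalue: float


def transfer_annotation(hits, min_identity=30.0, max_evalue=1e-5):
    """Return the description of the best hit that passes both cutoffs.

    The cutoffs are illustrative assumptions; sensible values depend on
    the database and on how conservative the annotation should be.
    """
    passing = [h for h in hits
               if h.percent_identity >= min_identity and h.evalue <= max_evalue]
    if not passing:
        return "hypothetical protein"  # assumed fallback label
    return min(passing, key=lambda h: h.evalue).description


if __name__ == "__main__":
    print(find_orfs("CCATGAAATTTTAGGG"))
    hits = [Hit("db|0001", "alkane hydroxylase", 62.0, 1e-40),
            Hit("db|0002", "weakly similar protein", 25.0, 1e-2)]
    print(transfer_annotation(hits))
```

The hit records in the usage lines are invented for the example. In practice they would come from the output of a search tool, and a label transferred this way remains a prediction until it is checked by manual curation or experiment.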

2 Genome Annotation

2.1 Automatic Annotation

The first step in the interpretation of a raw genomic sequence is to run gene-prediction programs such as Glimmer (Delcher et al., 1999), GeneMark (Besemer and Borodovsky, 2005) or tRNAscan-SE, which scan the sequence for regions that are likely to encode proteins or functional RNA products. The identified genes are then compared by algorithms such as BLAST (Altschul et al., 1990) to databases of DNA or protein sequences. If the search returns hits above a certain threshold of similarity, the functional information attached to those hits is transferred to the new sequence. In addition, the annotation tools often generate predictions about chemical and structural properties, genetic organization, metabolic pathways, gene ontologies and phylogenetic information. Table 1 provides an overview of the most common tools and software packages applicable to the automatic annotation of a newly sequenced genome. Users may submit their raw genome sequence data to annotation pipelines such as BASys (Van Domselaar et al., 2005), RAST (Aziz et al., 2008) or the J. Craig Venter Institute Annotation Engine. Alternatively, the user may dec