Comparative Genomics via Wavelet Analysis for Closely Related Bacteria
Jiuzhou Song Department of Biochemistry and Molecular Biology, Faculty of Medicine, University of Calgary, 3330 Hospital Drive NW, Calgary, Alberta, Canada T2N 4N1 Email: [email protected]
Tony Ware Department of Biochemistry and Molecular Biology, Faculty of Medicine, University of Calgary, 3330 Hospital Drive NW, Calgary, Alberta, Canada T2N 4N1 Email: [email protected]
Shu-Lin Liu Department of Biochemistry and Molecular Biology, Faculty of Medicine, University of Calgary, 3330 Hospital Drive NW, Calgary, Alberta, Canada T2N 4N1 Email: [email protected]
M. Surette Department of Biochemistry and Molecular Biology, Faculty of Medicine, University of Calgary, 3330 Hospital Drive NW, Calgary, Alberta, Canada T2N 4N1 Email: [email protected]
Received 26 February 2003; Revised 11 September 2003
Comparative genomics has been a valuable method for extracting and extrapolating genome information among closely related bacteria. The efficiency of traditional methods is strongly influenced by the software used. To overcome this problem, we propose using wavelet analysis to perform comparative genomics. First, global comparison using wavelet analysis quantifies the difference between genomes. Then, local comparison using keto-excess or purine-excess plots shows the precise positions of inversions, translocations, and horizontally transferred DNA fragments. We found, for the first time, that the level of difference between energy spectra is related to the similarity of the bacterial strains; it could serve as a quantitative index of genome similarity. The strategy is described in detail through comparisons of closely related strains: S. typhi CT18, S. typhi Ty2, S. typhimurium LT2, H. pylori 26695, and H. pylori J99.
Keywords and phrases: comparative genomics, gene discovery, wavelet analysis, bacterial genome.
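The two-stage strategy summarized in the abstract can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' implementation: the Haar wavelet, the truncation to a power-of-two length, and the `spectrum_distance` index are hypothetical choices made for the example. The purine (A, G) and keto (G, T) groupings are the standard base classes.

```python
import math

PURINES = set("AG")  # purine-excess walk: purine +1, pyrimidine -1
KETO = set("GT")     # keto-excess walk: keto +1, amino -1


def excess_walk(seq, plus_bases):
    """Cumulative excess along a sequence: +1 for bases in plus_bases,
    -1 for the other unambiguous bases."""
    walk, total = [], 0
    for base in seq.upper():
        if base in "ACGT":
            total += 1 if base in plus_bases else -1
        # Ambiguous symbols (e.g. N) take no step (assumption).
        walk.append(total)
    return walk


def haar_energy_spectrum(signal):
    """Energy of the Haar detail coefficients at each level, finest first.

    The signal is truncated to the largest power-of-two length
    (an assumption made to keep the transform simple).
    """
    if not signal:
        return []
    n = 1 << (len(signal).bit_length() - 1)
    approx = [float(x) for x in signal[:n]]
    energies = []
    while len(approx) > 1:
        pairs = list(zip(approx[0::2], approx[1::2]))
        detail = [(a - b) / math.sqrt(2) for a, b in pairs]
        approx = [(a + b) / math.sqrt(2) for a, b in pairs]
        energies.append(sum(d * d for d in detail))
    return energies


def spectrum_distance(seq_a, seq_b, plus_bases=PURINES):
    """Hypothetical similarity index: sum of absolute differences between
    the normalized energy spectra of two excess walks, compared over the
    levels both spectra share."""
    def normalized(seq):
        spectrum = haar_energy_spectrum(excess_walk(seq, plus_bases))
        total = sum(spectrum)
        return [e / total for e in spectrum] if total else spectrum

    return sum(abs(a - b) for a, b in zip(normalized(seq_a), normalized(seq_b)))
```

In this sketch, `spectrum_distance` plays the role of the global, quantitative comparison, while plotting `excess_walk(seq, KETO)` or `excess_walk(seq, PURINES)` against position corresponds to the local comparison, where abrupt changes in slope would mark candidate rearrangement points.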
1. INTRODUCTION
Since the publication of the whole genomic sequence of Haemophilus influenzae [1], the genomes of more than 90 bacterial strains have been completed. A notable outcome of these genome projects is that at least one third of the genes encoded in each genome have no known or predictable function. Genome sequencing, while not providing the detailed minutiae of the complete sequences, allows comparisons between genomes to identify the insertions, deletions, and transfers that are undoubtedly important in the different phenotypes of strains. However, as the level of evolutionary conservation of microbial proteins is rather uniform, a large portion of the gene products from each of the sequenced genomes has homologs in distant genomes [2].
The functions of many of these genes may be predicted by comparing the newly sequenced genomes with those of better-studied organisms. This makes comparative genomics a very powerful approach for better understanding the genomes and biology of these organisms, and for determining what is common and what is unique between different species at the genome level, especially in genome analysis and annotation. In ad