An ant colony algorithm for multiple sequence alignment in bioinformatics

This paper describes a the application of ant colony optimization algorithms, which draw inspiration from the way ants organize themselves in searching for food, to the well-known bioinformatics problem of aligning several protein sequences.

  • PDF / 1,040,753 Bytes
  • 5 Pages / 595.276 x 793.687 pts Page_size
  • 45 Downloads / 253 Views

DOWNLOAD

REPORT


An ant colony algorithm for multiple sequence alignment in bioinformatics Jonathan Moss and Colin G. Johnson * 'Computing Laboratory University of Kent at Canterbury Canterbury, Kent, CT2 7NF, England. C.G.Johnson~ukc . ac.uk

Abstract This paper describes a the application of ant colony optimization algorithms, which draw inspiration from the way ants organize themselves in searching for food, to the well-known bioinformatics problem of aligning several protein sequences.

1 Introduction Swarm intelligence methods are computational techniques inspired by animals such as social insects acting together to solve complex problems. The main application of these techniques has been to combinatorial optimization problems. This paper discusses work-in-progress on the application of swarm intelligence ideas to a bioinformatics problem, viz. aligning multiple protein sequences which are believed to be related. The paper begins with a brief survey of swarm intelligence and the multiple sequence alignment problem. The application of one to the other is then described, and some preliminary results are given both on synthetic problems and on real-world data. 2 Ant colony optimization and swarm intelligence Ant colonies are able to organize their foraging behaviour in a seemingly efficient way without any centralized control [7]. This self-organizing structure is carried out via stigmergic communication, i.e. communication by changing the environment, in this case by laying down pheromone trails. Initially ants have no idea of where food is in the environment, so they wander randomly, leaving a pheromone trail. When an ant finds food it wanders back to the nest. Initially these paths will be arbitrary, but when an ant follows a shorter path it will be able to follow that path more often within the same time period than an ant following a longer path, so there is a positive reinforcement process whereby the shorter paths get stronger. A simple version of this is illustrated in figure 1, where ants have two possible routes from a nest to

D. W. Pearson et al. (eds.), Artificial Neural Nets and Genetic Algorithms © Springer-Verlag/Wien, 2003

I

food Sown

Fig. 1. A simple ant foraging problem. a food source. If two ants set out at the same time, one taking route A and one route B, which is twice as long, then the ant taking A will have travelled back and forth between the food source twice in the same time that the other ant has travelled back and forth once. Therefore there will be a stronger pheromone trail on route A compared to route B. This idea can be effectively scaled up to solving route finding problems such as the TSP, with performance as good as or better than existing heuristics [2, 3]. 3 Multiple sequence alignment Proteins are complex molecules which consist of a long chain of amino acids. The sequence of these amino acids along the chain is specified by transscribing and translating the DNA sequence in the cell. These chains fold up into a complex three dimensional structure. Proteins are the basic building blocks o